(57) Abstract

The invention relates to infectious cDNA clones for Dengue 2 virus, strain 16681, and its live, attenuated vaccine derivative, PDK-53
(DEN-2 PDK-53). The invention also relates to infectious cDNA clones for chimeric viruses characterized as expressing structural genes
of a Dengue 1, Dengue 3, or Dengue 4 attenuated virus in the context of the nonstructural genes of the Dengue 2 PDK-53 virus (DEN-2/1,
DEN-2/3, DEN-2/4). The invention further relates to genetic constructs encoding these cDNAs, and host cells containing these constructs.
The invention moreover relates to quadravalent vaccines providing immunity against all four serotypes of dengue virus comprising DEN-2
PDK-53 infectious clone derivative, DEN-2/1, DEN-2/3, or DEN-2/4 viruses, and related methods of immunization.

INFECTIOUS DENGUE 2 VIRUS PDK-53 AS QUADRAVALENT VACCINE



Field of the Invention 

The invention relates to infectious cDNA clones for
Dengue 2 virus, strain 16681, and its live, attenuated
vaccine derivative, PDK-53 (DEN-2 PDK-53). The invention
also relates to infectious cDNA clones for chimeric
viruses characterized as expressing structural genes of a
Dengue 1, Dengue 3, or Dengue 4 attenuated virus in the
context of the nonstructural genes of the Dengue 2 PDK-53
virus (DEN-2/1, DEN-2/3, DEN-2/4). The invention further
relates to genetic constructs encoding these cDNAs, and
host cells containing these constructs. The invention
moreover relates to quadravalent vaccines providing
immunity against all four serotypes of dengue virus
comprising DEN-2 PDK-53 infectious clone derivative,
DEN-2/1, DEN-2/3, or DEN-2/4 viruses, and related methods of
immunization.

Background of the Invention 
Arthropod-borne viruses (arboviruses) are a diverse
group of viruses that have been lumped together on the
basis of their ecological niche, which involves cycles of
transmission between vertebrate hosts and arthropod
vectors such as mosquitos and ticks. The prototype
arbovirus is yellow fever virus, a flavivirus, which was
isolated in 1927. In the 1950s, the Rockefeller
Foundation established a number of field stations in
various tropical countries for the purpose of isolating
new viruses. The 1985 International Catalogue of
Arboviruses Including Certain Other Viruses of Vertebrates
contains registrations for 504 discrete arboviruses, 124
of which have caused disease in humans. Thirty-four
viruses of the Flavivirus genus (family Flaviviridae) of
arboviruses are human pathogens (Karabatsos, 1985). (All
publications cited hereunder are incorporated herein by
reference.)

According to a 1992 World Health Organization (WHO)
press release (Press Release WHO/74, November 24, 1992),
dengue hemorrhagic fever is one of the most important and
increasing mosquito-transmitted infections in the world,
with more than 85 countries in Asia, the Pacific Islands,
Africa, Central America, and South America being
threatened with dengue outbreaks. Dengue fever was known
in the past as "breakbone fever" due to the severe
muscular and joint pain that accompanied the high fever
during this infection. Dengue is an under-reported
disease: it is thought that millions of cases occur each
year.

Dengue (DEN) viruses, which are flaviviruses, are
classified antigenically into 4 serotypes (DEN-1, DEN-2,
DEN-3, and DEN-4). Multiple serotypes are now endemic in
most countries in the tropics. DEN viruses are
transmitted to humans principally by Aedes aegypti
mosquitos throughout much of the tropical and subtropical
region of the world. Viruses of all four serotypes infect
humans and cause clinically inapparent infection or
illness ranging from dengue fever to severe and often
fatal dengue hemorrhagic fever/dengue shock syndrome
(DHF/DSS). DHF/DSS has been associated epidemiologically
and experimentally with immune enhancement of virus
replication by preexisting, subneutralizing levels of
heterotypic antibody. About 90% or more of patients with
DHF/DSS are children who are 14 years old or younger
(Halstead, 1970; Halstead, 1988). Case fatality rates in
untreated individuals can be as high as 15-20%. Between
1956 and 1978, hospitalization of more than 350,000 dengue
patients and about 12,000 deaths in Southeast Asia were
reported to the WHO (Halstead, 1980). More recent dengue
epidemics in Asia, the Pacific islands, the Americas, and
Africa indicate that the incidence, with up to 40 million
cases annually, and geographic distribution of the disease
are increasing in Aedes aegypti-infested areas of the world
(Halstead, 1984; Gubler, 1988; Brandt, 1990).

Since eradication of Aedes aegypti mosquitos appears
to be practically infeasible, development of safe,
effective vaccines against all four serotypes of DEN virus
is a WHO priority (Gubler, 1988; Brandt, 1988; Brandt,
1990). Since the level of DEN virus replication in
certified cell cultures yields insufficient antigenic mass
to produce effective inactivated vaccines, priorities are
given to developing effective live, attenuated vaccine
viruses and using a variety of expression systems such as
recombinant vaccinia or avipox virus (live vaccine),
recombinant baculovirus (subunit vaccine), and recombinant
E. coli (subunit vaccine) to express certain genes of the
DEN viral genome (Brandt, 1988; Brandt, 1990).

Flaviviruses are enveloped RNA viruses 45 to 50 nm in
diameter that contain a single-stranded, positive-sense
capped RNA genome of approximately 11 kb. The RNA genome
does not have a 3'-terminal poly(A) tail. Because the
genetic molecule of flaviviruses is positive or messenger
RNA (mRNA)-sense, naked genomic RNA injected, transfected,
or electroporated into mammalian or invertebrate cells is
capable of associating directly with the ribosomal protein
synthetic machinery of the cell. All of the viral
proteins are translated from the inserted viral genomic
mRNA. These virus-specified proteins then replicate the
viral genome, resulting in intracellular virus maturation
and release of infectious virus from the transfected cell.
The gene organization of the flavivirus mRNA genome,
illustrated below, is 5'-noncoding region (5'-NC)-capsid-
premembrane/membrane (prM/M)-envelope (E)-nonstructural
protein 1 (NS1)-NS2A-NS2B-NS3-NS4A-NS4B-NS5-3'-noncoding
region (3'-NC). The structural proteins capsid, prM/M,
and E and nonstructural proteins are translated as a large
precursor polyprotein molecule from a single long open
reading frame in the mRNA genome. The individual mature
viral proteins are processed from the polyprotein by both
cell- and virus-specified proteases (Westaway et al., 1985;
Coia et al., 1988; Speight and Westaway, 1989; Rice et
al., 1985).

Genome Organization of Dengue Virus and Other Flaviviruses

5'-NC | C | prM/M | E | NS1 | 2A | 2B | NS3 | 4A | 4B | NS5 | 3'-NC
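
By way of illustration only, the gene order recited above can be written down as a small data structure. The following Python sketch is not part of the disclosed method; the names GENOME_ORDER, classify, and polyprotein_order are hypothetical and chosen only for this example. It encodes the 5'-to-3' order of regions and labels each as noncoding, structural, or nonstructural as described in the text.

```python
# Illustrative sketch only: region order as recited in the text (5' to 3').
# All identifiers here are hypothetical names introduced for this example.
GENOME_ORDER = [
    "5'-NC", "C", "prM/M", "E",
    "NS1", "NS2A", "NS2B", "NS3", "NS4A", "NS4B", "NS5",
    "3'-NC",
]

STRUCTURAL = {"C", "prM/M", "E"}   # proteins incorporated into the virion
NONCODING = {"5'-NC", "3'-NC"}     # flanking untranslated regions


def classify(region):
    """Return 'noncoding', 'structural', or 'nonstructural' for a region."""
    if region not in GENOME_ORDER:
        raise ValueError("unknown region: " + region)
    if region in NONCODING:
        return "noncoding"
    if region in STRUCTURAL:
        return "structural"
    return "nonstructural"


def polyprotein_order():
    """Regions translated from the single open reading frame, in order."""
    return [r for r in GENOME_ORDER if r not in NONCODING]


if __name__ == "__main__":
    for region in GENOME_ORDER:
        print(region, classify(region))
```

The list form keeps the order explicit, so the single open reading frame is simply the list with the two noncoding ends removed.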

The structural proteins are those viral proteins that
are incorporated into the mature virion. The virion
consists of an icosahedral capsid (C) that packages the
viral genomic mRNA (nucleocapsid). The nucleocapsid is
surrounded by a cell-derived lipid membrane into which the
envelope (E) and mature membrane (M) proteins are
imbedded. The virus-specific nonstructural genes, NS1-NS5,
are expressed in the cytoplasm of the infected cell
and are involved in the replication and maturation of the
viral RNA genome and viral proteins.

The E glycoprotein of the virus is exposed to the
environment and is involved in attachment and entry of the
virus into the cell. The E protein is the primary viral
immunogen against which the infected vertebrate host
develops virus-specific neutralizing antibody. The E gene
is the most common target for development of molecular
systems to express the encoded E glycoprotein. However,
immunization with various purified nonstructural genes of
the virus has been shown to elicit protective immunity
against challenge with wild-type virus, probably via
cytotoxic T-cell mediated lysis of infected cells which
express viral nonstructural proteins on the cell surface.

Vaccination can be one of the most cost-effective
ways to prevent dengue fever and DHF/DSS. Since 1979 the
WHO has supported research on dengue vaccine development
at the Mahidol University in Bangkok, Thailand (Press
Release WHO/74, November 24, 1992). Investigators at
Mahidol University have developed four live, attenuated
candidate vaccine viruses, one for each of the four
serotypes, by serial passage of the virulent parent
viruses in primary dog kidney (PDK) or fetal rhesus lung
(FRhL) cell culture (Yoksan et al., 1986; Bhamarapravati
et al., 1987). Phase 1 and Phase 2 clinical trials in
Thailand have demonstrated that the vaccine is both safe
and immunogenic in humans. The vaccines now need to be
tested for efficacy in large numbers of children (Press
Release WHO/74, November 24, 1992). To preclude the
possible severe DHF/DSS immune enhancement phenomenon in
vaccinees who might be infected naturally with a
heterologous serotype of wild-type DEN virus following
immunization with a single serotype of vaccine virus, it
is essential that humans be vaccinated with a quadravalent
vaccine to provide immunity against all four serotypes of
the virus.

Summary of the Invention
The invention provides a quadravalent vaccine
providing immunity against all four serotypes of dengue
virus comprising a DEN-2 PDK-53 infectious clone-derived
virus.

The invention also provides a quadravalent 
vaccine providing immunity against all four serotypes of 
dengue virus comprising a chimeric DEN-2/1 virus. 

The invention further provides a quadravalent
vaccine providing immunity against all four serotypes of
dengue virus comprising a chimeric DEN-2/3 virus.

The invention moreover provides a quadravalent
vaccine providing immunity against all four serotypes of
dengue virus comprising a chimeric DEN-2/4 virus.

The invention additionally provides a
quadravalent vaccine providing immunity against all four
serotypes of dengue virus comprising DEN-2 PDK-53
infectious clone-derived and chimeric DEN-2/1, DEN-2/3,
and DEN-2/4 viruses.

In another aspect, the invention provides a
method of immunization in which a desired immune response
is produced against all four serotypes of dengue virus
comprising the step of administering to a subject a
quadravalent vaccine comprising DEN-2 PDK-53 infectious
clone-derived and chimeric DEN-2/1, DEN-2/3, and DEN-2/4
viruses.

In yet another aspect, the invention provides a
composition of matter comprising a full genome-length
infectious cDNA clone for a DEN-2 virus, strain 16681.

The invention also provides a composition of
matter comprising a full genome-length infectious cDNA
clone for a DEN-2 virus of a strain characterized as
replicating to high titer in cell culture.

The invention further provides a composition of
matter comprising a full genome-length infectious cDNA
clone for a DEN-2 virus, strain 16681, having the
identifying characteristics of ATCC 69826.

In still another aspect, the invention provides
a composition of matter comprising a full genome-length
infectious cDNA clone for a DEN-2 virus, strain 16681,
attenuated derivative, PDK-53.

The invention also provides a composition of
matter comprising a full genome-length infectious cDNA
clone for a DEN-2 virus attenuated derivative,
characterized as replicating to high titer in cell
culture.

The invention further provides a composition of
matter comprising a full genome-length infectious cDNA
clone for a DEN-2 virus, strain 16681, attenuated
derivative, PDK-53, having the identifying characteristics
of ATCC 69825.

In another aspect, the invention provides a
composition of matter comprising a full genome-length
infectious cDNA clone of a chimeric DEN-2/1 virus, wherein
the virus is characterized as expressing the prM and E
genes of a DEN-1 attenuated virus in the context of the
nonstructural genes of the DEN-2 PDK-53 virus. The DEN-1
attenuated virus may be DEN-1 PDK-13.

The invention also provides a composition of
matter comprising a full genome-length infectious cDNA
clone of a chimeric DEN-2 virus, wherein the virus is
characterized as expressing the antigenicity of a DEN-1
attenuated virus.

In yet another aspect, the invention provides a
composition of matter comprising a full genome-length
infectious cDNA clone of a chimeric DEN-2/3 virus, wherein
the virus is characterized as expressing the prM and E
genes of a DEN-3 attenuated virus in the context of the
nonstructural genes of the DEN-2 PDK-53 virus. The DEN-3
attenuated virus may be DEN-3 PGMK30/FRhL-3.

The invention also provides a composition of
matter comprising a full genome-length infectious cDNA
clone of a chimeric DEN-2 virus, wherein the virus is
characterized as expressing the antigenicity of a DEN-3
attenuated virus.

In still another aspect, the invention provides
a composition of matter comprising a full genome-length
infectious cDNA clone of a chimeric DEN-2/4 virus, wherein
the virus is characterized as expressing the prM and E
genes of a DEN-4 attenuated virus in the context of the
nonstructural genes of the DEN-2 PDK-53 virus. The DEN-4
attenuated virus may be DEN-4 PDK-48.

The invention also provides a composition of
matter comprising a full genome-length infectious cDNA
clone of a chimeric DEN-2 virus, wherein the virus is
characterized as expressing the antigenicity of a DEN-4
attenuated virus.

Additionally, the invention provides a genetic
construct comprising a DNA sequence operably encoding the
polyprotein of DEN-2 virus, strain 16681. The polyprotein
may be the polyprotein encoded by the nucleotide sequence
of SEQ ID NO:1.

The invention also provides a genetic construct
comprising a DNA sequence operably encoding at least one
protein of DEN-2 virus, strain 16681. The protein may be
a protein encoded by the nucleotide sequence of SEQ ID
NO:1.

Further, the invention provides a genetic
construct comprising a DNA sequence operably encoding the
polyprotein of DEN-2 virus, strain 16681, attenuated
derivative, PDK-53. The polyprotein may be the
polyprotein encoded by the nucleotide sequence of SEQ ID
NO:2.

The invention also provides a genetic construct
comprising a DNA sequence operably encoding at least one
protein of DEN-2 virus, strain 16681, attenuated
derivative, PDK-53. The protein may be a protein encoded
by the nucleotide sequence of SEQ ID NO:2.

Moreover, the invention provides a genetic
construct comprising a DNA sequence operably encoding at
least one structural protein of DEN-1 PDK-13. The
structural protein may be a structural protein encoded by
the nucleotide sequence of SEQ ID NO:124.

In another aspect, the invention provides a
genetic construct comprising a DNA sequence operably
encoding at least one structural protein of DEN-3
PGMK30/FRhL-3. The structural protein may be a structural
protein encoded by the nucleotide sequence of SEQ ID
NO:125.

In still another aspect, the invention provides
a genetic construct comprising a DNA sequence operably
encoding at least one structural protein of DEN-4 PDK-48.
The structural protein may be a structural protein encoded
by the nucleotide sequence of SEQ ID NO:126.

In yet another aspect, the invention includes a 
host cell comprising any of the above genetic constructs. 

Brief Description of the Drawings

Figure 1: Strategy for construction of the full
genome-length cDNA clone of DEN-2 virus. Using PCR
technology, cDNA is amplified from the genomic RNA of the
virus and cloned. Subclones are spliced together at
unique, overlapping restriction enzyme sites to construct
the full genome-length clone. Numbered arrows indicate the
upstream (right arrows) and downstream (left arrows) primers
used to amplify the cDNA in PCR reactions.

Figure 2: Transcription of genomic mRNA from the
full-length infectious cDNA clone of DEN-2 virus. The
recombinant plasmid is linearized at the unique XbaI site
at the 3'-end of the genomic cDNA. Bacteriophage T7 RNA
polymerase recognizes the T7 promoter engineered at the
5'-end of the cDNA and transcribes full-length viral mRNA
from the cDNA template.

Figure 3: Restriction enzyme sites identified in the
nucleotide sequence of the RNA genome of DEN-2 16681
virus. Locations for the sites are indicated by the
genome nucleotide numbers. Restriction enzymes that
cleave the DEN-2 genomic cDNA at only a single location
are listed vertically at the top of the figure. The
resolution of the restriction enzyme (RENZ) graph is 97.5
nucleotides per dot.
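
The notion of an enzyme that cleaves at only a single location can be illustrated with a short Python sketch. This is an example under stated assumptions, not the procedure used to prepare Figure 3: the sequence TOY is an invented 34-base string, not DEN-2 cDNA, and the function names find_sites and single_cutters are hypothetical. The recognition sequences listed are the standard palindromic sites for the named enzymes, so searching one strand suffices.

```python
# Illustrative sketch: find enzymes whose recognition site occurs exactly
# once in a sequence. TOY is an invented sequence, not viral cDNA.

ENZYMES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "XhoI": "CTCGAG",
    "KpnI": "GGTACC",
}

TOY = "AAGAATTCGGTCTAGACCGAATTCTTGGTACCAA"


def find_sites(seq, site):
    """Return 1-based start positions of every occurrence of site in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)
        # Advance by one base so overlapping occurrences are not missed.
        start = seq.find(site, start + 1)
    return positions


def single_cutters(seq, enzymes):
    """Map enzyme name to site position for enzymes with exactly one site."""
    unique = {}
    for name, site in enzymes.items():
        hits = find_sites(seq, site)
        if len(hits) == 1:
            unique[name] = hits[0]
    return unique


if __name__ == "__main__":
    print(single_cutters(TOY, ENZYMES))
```

In the toy sequence EcoRI has two sites and XhoI has none, so only XbaI and KpnI are reported. Positions are the first base of the recognition sequence, not the exact phosphodiester bond cleaved.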

Figure 4: Growth curve of DEN-2 16681 virus in C6/36
mosquito cells.

Figure 5: (A) Polaroid prints showing RT/PCR
amplification of the entire mRNA genome of DEN-2 virus,
strain 16681, in the form of 5 cDNA amplicons. The
molecular weight marker (MW) consists of linear,
double-stranded DNA markers of various base pair (bp) lengths.
The top 2 gels show 5-µl aliquots of the original RT/PCR
reactions. The bottom two gels show 10% of the yield
following HMC agarose gel purification of the remaining
95-µl reaction aliquots. (B) Primers (amplimers) used in
the RT/PCR reactions and the expected sizes of the
resulting cDNA amplicons.

Figure 6: EcoRI restriction enzyme digests of F2,
F2-Sal, Sal-F2, and F3 miniprep recombinant plasmid DNA.
Plasmids from individual colonies resulting from
transformation with independent ligated, recombinant
plasmid molecules are numbered. The insert in the single
F2-8 plasmid was too small and was discarded. The
remaining recombinant plasmids contained cDNA inserts of
expected size. As expected, F2-Sal cDNA contained two
internal EcoRI sites; the Sal-F2 and F3 plasmids contained
a single internal EcoRI site. EcoRI digestion of the
recombinant plasmids regenerated linearized, wild-type
3.9-kb pCRII vector. For an undetermined reason, one of
the EcoRI sites in plasmid F3-1 did not cut.

Figure 7: Schematic diagram showing the genomic
locations of DEN-2 16681 virus-specific cDNA clones.
Clones indicated with asterisks were spliced together at
the indicated restriction enzyme sites to construct the
full genome-length cDNA clone. Black horizontal bars
indicate clone regions that were sequenced. Light gray
regions of horizontal bars indicate clone regions that
were not sequenced.

Figure 8: (A) Effect of adding Taq extender reagent
to PCR reactions. The 5.2-kbp amplicon of St. Louis
encephalitis virus was readily obtained by extended PCR
(+) but not by standard PCR (-). (B) Agarose gel
electropherogram showing DEN-2 PDK-53 F1, F2, and F3
amplicons derived by extended PCR.

Figure 9: Schematic diagram showing the genome
locations of errors identified in the cDNA clones of DEN-2
16681. Errors are indicated by short vertical tick marks.

Figure 10: Schematic diagram illustrating the
approximate genome locations of the nucleotide
discrepancies between the data of Applicants and those of
Blok et al. (1992) for the sequence of the genome of DEN-2
virus, strain 16681.

Figure 11: Nucleotide sequence of the genome of DEN-2
strain 16681 virus. Differences between the data
determined by Blok et al. (1992) (DEN-2-16681.BLOK) and
those obtained by Applicants (DEN-2-16681.RK). The genome
nucleotide positions of the sequence differences are
listed vertically. The solid squares indicate those
nucleotide differences that also encode amino acid
substitutions. The remaining nucleotide differences are
either silent, encoding the same amino acid, or lie within
the 5'-noncoding (5'-NC) or 3'-noncoding region (3'-NC).

Figure 12: Schematic diagram showing the DEN-2 PDK-53
virus-specific cDNA clones and the approximate
locations of cDNA errors (vertical tick marks) identified
by nucleotide sequence analyses. Clones marked with an
asterisk were used in the construction of the DEN-2 PDK-53
virus-specific full-length cDNA clone. Clone #19 had a
203-bp deletion (horizontal line).

Figure 13: Schematic summary of the DEN-2 16681 vs.
PDK-53 virus sequencing projects. Arrows indicate the
nucleotide differences detected between the two genomes.
Triangles indicate those nucleotide changes that resulted
in amino acid substitutions.

Figure 14: Finalized nucleotide and amino acid
sequence of the RNA genome of DEN-2 virus, strain 16681
(SEQ ID NO:1). The nucleotide and amino acid mutations
that were determined to have occurred in DEN-2 virus,
strain PDK-53, are indicated at the appropriate positions
(SEQ ID NO:2). The EcoRI, SstI, MluI, and T7 promoter
sites that were engineered immediately preceding the
5'-terminal nucleotide of the virus-specific genomic cDNA are
shown. The start positions of the viral genes and
noncoding regions (5'-NC and 3'-NC) are shown. Potential
sites of Asn-linked glycosylation (Asn-X-Ser or Thr, where
X = any amino acid) in prM, E, and NS1 are indicated by
asterisks. The deduced amino acid sequence is indicated
in standard single-letter abbreviation: A = Ala, C = Cys,
D = Asp, E = Glu, F = Phe, G = Gly, H = His, I = Ile, K =
Lys, L = Leu, M = Met, N = Asn, P = Pro, Q = Gln, R = Arg,
S = Ser, T = Thr, V = Val, W = Trp, Y = Tyr.
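
The Asn-X-Ser/Thr motif described above lends itself to a simple pattern search over a single-letter amino acid sequence. The sketch below is offered only as an example: the function name find_sequons, the exclude_proline option, and the peptide TOY are all hypothetical and do not come from the figure. The default follows the caption (X = any amino acid); the option reflects the commonly applied refinement that proline at X is not glycosylated.

```python
import re

# Illustrative sketch: locate potential Asn-linked glycosylation sites
# (Asn-X-Ser or Asn-X-Thr) in a single-letter amino acid sequence.
# TOY is an invented peptide, not a dengue virus protein.

TOY = "MNKSANPTQNNTS"


def find_sequons(protein, exclude_proline=False):
    """Return 1-based positions of Asn residues that start an N-X-S/T motif.

    With exclude_proline=False, X may be any residue, as in the caption.
    With exclude_proline=True, motifs with Pro at X are skipped.
    """
    x = "[^P]" if exclude_proline else "."
    # A lookahead is used so that overlapping motifs are all reported.
    pattern = re.compile("(?=N" + x + "[ST])")
    return [m.start() + 1 for m in pattern.finditer(protein)]


if __name__ == "__main__":
    print(find_sequons(TOY))
    print(find_sequons(TOY, exclude_proline=True))
```

Positions are reported for the asparagine itself, which is the residue that would carry the asterisk in a figure of this kind.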

Figure 15: Construction of intermediate clone F2 by
ligating the F2-Sal SphI/HpaI fragment and Sal-F2
HpaI/KpnI fragment into pUC18. The resulting F2 clone
contained a nonsilent cDNA error at genome nucleotide
position 1730.

Figure 16: Correction of the intermediate F2 clone.
A new PCR amplicon was cloned and sequenced. The
SphI/HpaI fragment of this clone was spliced into F2 to
construct F2-C having the correct nucleotide at genome
position 1730.

Figure 17: Construction of the intermediate F1/3/4/5
cDNA clone for DEN-2 16681 virus. The thick solid black
bars indicate DEN-2 virus-specific cDNA, illustrated with
the restriction enzyme (RENZ) sites of the multiple cloning
site (MCS) of the plasmid. The RENZ sites
used in each step of the splicing strategy are indicated
in underlined, bold characters. The top half of the
figure shows construction of F1/3/4/5-pUC18. The bottom
portion of the figure illustrates the making of
F1/3/4/5-pUC19. The final step in the construction of the full
genome-length cDNA clone involved the ligation of the F2-C
SphI/KpnI cDNA fragment into plasmid containing cDNA
F1/3/4/5 and cut with RENZs SphI/KpnI. Although F2-C cDNA
could not be cloned into F1/3/4/5-pUC18, it was readily
cloned into F1/3/4/5-pUC19. The pUC18 plasmid containing
a small insert of cDNA made for Venezuelan equine
encephalitis (VEE) virus was used simply to move F1 and
F4/5 into pUC18 in a 3-molecule ligation reaction. The
VEE virus-specific cDNA was spliced out during this
process. Arrowheads under cDNA bars indicate orientation
of mRNA-sense cDNA strand.

Figure 18: Orientation-specific cloning of full
genome-length cDNA of DEN-2 16681 virus into the multiple
cloning site of pUC19. Although the full-length cDNA was
readily cloned in pUC19, multiple attempts to insert the
cDNA into pUC18 failed. Presumably, interaction of the
cDNA with pUC18-specific gene transcripts, translation of
a toxic DEN-2 polypeptide, or translation of a toxic
pUC18/DEN-2 fusion polypeptide produced deleterious
effects in E. coli. Large arrows indicate orientation of
mRNA-sense cDNA strands in the pUC plasmid backbone.
Smaller arrows indicate orientations of the lacZ and
ampicillin genes as well as the origin of replication.
DEN-2 insert is indicated by a thick solid black line.

Figure 19: Insertion of the MCS of plasmid pUC19
into pBR322 in both orientations to construct pBRUC-138
and pBRUC-139. The pUC18 HindIII (blunt-ended = BL)/EcoRI
MCS fragment was ligated into pBR322 cut with AvaI
(BL)/EcoRI to construct pBRUC-138. The pUC18 EcoRI
(BL)/HindIII MCS fragment was ligated into pBR322 cut with
AvaI (BL)/HindIII to make pBRUC-139. In both cases, the
tetracycline gene of pBR322 was removed. pBRUC-138 =
2992-bp (61-bp MCS + 2931-bp pBR322 deletion vector).
pBRUC-139 = 3022-bp (61-bp MCS + 2961-bp pBR322 deletion
vector). Orientations of ORI, ROP, and the Amp gene are
indicated.
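
The plasmid sizes stated above follow from adding the 61-bp MCS to each pBR322 deletion vector. As a minimal check of that arithmetic, the following Python sketch uses only the figures given in the caption; the names MCS_BP, BACKBONES_BP, and plasmid_size are hypothetical labels introduced for this example.

```python
# Illustrative arithmetic check using only values stated in the caption.
MCS_BP = 61  # size of the inserted multiple cloning site fragment

# pBR322 deletion vector sizes (bp) as stated for each construct
BACKBONES_BP = {"pBRUC-138": 2931, "pBRUC-139": 2961}


def plasmid_size(backbone_bp, insert_bp=MCS_BP):
    """Total plasmid size is the backbone plus the ligated insert."""
    return backbone_bp + insert_bp


SIZES_BP = {name: plasmid_size(bp) for name, bp in BACKBONES_BP.items()}

if __name__ == "__main__":
    for name, size in SIZES_BP.items():
        print(name, size, "bp")
```

The two totals agree with the 2992-bp and 3022-bp sizes recited in the caption.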

Figure 20: Construction of pD2/IC-30P, the full
genome-length cDNA clone of DEN-2 16681 virus, in plasmid
pBR322 (pBRUC-139 (SphI-) derivative). The F3/4/5 clone
cDNA was ligated into pBRUC-139 first (top of figure),
followed by F1-E and F2-C. Viable, infectious DEN-2 virus
was successfully obtained from viral mRNA transcribed from
this clone.

Figure 21: Construction of pD2/IC-130V, the full
genome-length cDNA clone of DEN-2 PDK-53 virus. A
nonsilent error in cDNA clone F3-3C was corrected by
splicing in a correct BstBI/NheI fragment from clone
F3.5-6 (top). The resulting corrected clone F3-3CC was spliced
into the 16681 F345-F clone in pBRUC-139. cDNA fragments
F1-79B, F2-16B, and the recombinant F3/4/5 vector DNA were
spliced together in a single ligation reaction to produce
pD2/IC-130V. The NheI site occurs at genome nucleotide
position 6646, and the cDNA downstream of that site
derives from the 16681 F345-F clone. Therefore, the PDK-53
virus-specific full-length cDNA clone contains the parental
16681 virus-specific nucleotide at position 8571. This
nucleotide difference is silent; it does not encode an amino
acid change. Other than the 8571 position, DEN-2 16681 and
PDK-53 viruses are identical in nucleotide sequence from
nucleotide position 6646 to the 3' terminus of the genome.

Figure 22: Agarose gel electropherogram of viral
genomic mRNA extracted from gradient-purified, wild-type
DEN-2 16681 virus and Venezuelan equine encephalitis (VEE)
virus. The quantity of RNA loaded onto the gel ranged
from 22 ng to 383 ng. The stock RNA was quantitated
spectrophotometrically at 260 nm. The genome-length RNA
band is clearly visible between the 4153-bp and 6788-bp MW
marker bands. Bands were visualized by incorporating 200
ng/ml of ethidium bromide stain in the gel and
electrophoresis buffer.

Figure 23: Transcription of RNA from pVE/IC-92 (VEE
virus clone) and pD2/IC-20 (DEN-2 16681 virus clone).
Transcription reaction conditions (100 ng linearized DNA
template, 12.5 mM DTT, 2.7 U/µl RNasin, 0.15 mM NTPs, 3.3
U/µl T7 RNA polymerase (Stratagene) in commercial buffer
(Stratagene)) yielded high quantity and quality of
infectious mRNA transcripts from the pVE/IC-92 clone and
3'-end truncation products of that clone. However, these
reaction conditions failed to permit transcription of RNA
from the pD2/IC-20 clone or two of its 3'-end
transcription products (clone linearized at the NsiI or
MroI site instead of at the 3'-terminal XbaI site).
pVE/IC-92 plasmid linearized at the MluI (3'-terminal),
SphI, Tth111I, HindIII, SalI, and StuI sites in the cDNA
clone yielded RNA transcripts of 11447, 11377, 7541, 2407,
1620, and 674 base length, respectively (the more intense,
prominent bands in these gel lanes).

Figure 24: Transcription of RNA from the DEN-2 16681
cDNA clone pD2/IC-20. (A) Transcription of RNA using
different quantities of linearized plasmid template (a,b).
The cap analog m7G(5')ppp(5')A was not included in the
reaction. (B) Transcription of 5'-capped RNA with
inclusion of cap analog in the reaction. Transcription
was accomplished with the Ampliscribe transcription kit
from Epicentre Technologies. T7 pol = bacteriophage T7
RNA polymerase.

Figure 25: Transcription of full genome-length,
infectious viral mRNA from XbaI-linearized DEN-2 16681
plasmid pD2/IC-30P (A and D replicate clones resulting
from independent bacterial colonies transformed with the
recombinant pBRUC/DEN-2 plasmid) and PDK-53 plasmid
pD2/IC-130V (F and J replicates). Genomic "viral RNA"
extracted from gradient-purified wild-type DEN-2 16681
virus was electrophoresed in lanes 2 and 10. Aliquots of
transcription reactions sampled before (T7 RNA polymerase
"-") and after (T7 Pol "+") addition of T7 RNA polymerase
are shown. Only the linearized plasmid DNA template is
observed in the absence of the polymerase.

Figure 26: Transcription of RNA from pD2/IC-20,
pD2/IC-30P, and pD2/IC-130V in the presence or absence of
T7 RNA polymerase or cap analog in the transcription
reaction. All lanes shown are on a single gel.
Transcription was performed with the Ampliscribe
transcription kit.

Figure 27: Derivation tree for the construction of
the DEN-2 16681 and PDK-53 virus-specific full
genome-length cDNA clones pD2/IC-30P and pD2/IC-130V,
respectively, and chimeric 16681/PDK-53 clones derived
from the two prototype clones.

Figure 28: Genotype maps of DEN-2 16681 and PDK-53
virus-specific full genome-length cDNAs and their chimeric
derivatives. The scale at the top indicates relative
genome nucleotide position in thousands. The graph
resolution is 119.1444 bp/dot. cDNA regions contributed
by the parental DEN-2 16681 virus are indicated by solid
black bars. Regions derived from the DEN-2 PDK-53 vaccine
virus are indicated by stippled bars. The 8 mutations
identified by sequence analyses of the genomes of the
16681 and PDK-53 viruses are indicated. The virus-specific
5'-noncoding nucleotides are indicated in lower
case characters. The amino acids encoded by the
virus-specific nucleotide mutations in the protein coding region
of the genome are indicated in upper case, single-letter
amino acid abbreviation.

Figure 29: Results of spot-sequencing PCR amplicons
amplified from seed stocks of viruses derived from full
genome-length cDNA clones. Dots indicate nucleotide
sequence identity to the DEN-2 16681 virus. The expected
virus-specific nucleotides for the genotype of each virus
are shown. Those nucleotide positions that have actually
been confirmed by sequence analysis are indicated by
underlined nucleotide base characters. The actual genome
nucleotide positions are indicated at the bottom of the
Figure.

Figure 30: Recombinant full-length pD2/IC-30P-A and
pD2/IC-130V-F plasmids extracted from 1-ml aliquots of E.
coli TB-1 cultures submitted to ATCC.

Figure 31: Partial nucleotide sequences of candidate
vaccine viruses:

DEN-1 16007 PDK-13 (D1.VAC) (SEQ ID NO:124)

DEN-2 16681 PDK-53 (D2.VAC) (see SEQ ID NO:2)

DEN-3 16562 PGMK-30/FRhL-3 (D3.VAC) (SEQ ID NO:125)

DEN-4 1036 PDK-48 (D4.VAC) (SEQ ID NO:126)

aligned with the nucleotide and deduced amino acid
sequences of DEN-2 16681 virus (see SEQ ID NO:1). Dots in
the DEN-1, DEN-3, and DEN-4 sequences signify identity
with the DEN-2 sequence.

Figure 32: Partial amino acid sequences of candidate 
vaccine viruses: 

DEN-1 16007 PDK-13 (D1.VAC) (SEQ ID NO:124)

DEN-2 16681 PDK-53 (D2.VAC) (see SEQ ID NO:2)

DEN-3 16562 PGMK-30/FRhL-3 (D3.VAC) (SEQ ID NO:125)

DEN-4 1036 PDK-48 (D4.VAC) (SEQ ID NO:126)

aligned with the deduced amino acid sequence of DEN-2
16681 virus (see SEQ ID NO:1). Dots in the DEN-1, DEN-3,
and DEN-4 sequences signify identity with the DEN-2
sequence.

Figure 33: Mutagenesis analysis of the 5' end of the
prM gene. The 447-452 sequence ("AACCAC" in DEN-2) can be
mutated to "CTCGAG" in all four DEN viruses to create an
XhoI site for cassette splicing. This modification
results in conservative Thr-Thr to Ser-Ser substitutions
at amino acid positions prM 4-5 in DEN-2 virus. By
creating this XhoI site, all four viruses will contain the
sequence FHLSSR at amino acid positions prM 1-6 (see
Figure 32). Nucleotide mutations that are necessary to
create the XhoI site are indicated by bold, underlined
characters in the nucleotide sequences of D2.VAC, D1.VAC,
D3.VAC, and D4.VAC and their respective primers designed
for amplification in PCR.

Figure 34: Mutagenesis analysis of the 3' end of the
E gene. The 2344-2349 sequence ("TCACGC" in DEN-2) can be
mutated to "TCTAGA" in all four DEN viruses to create an
XbaI site for cassette splicing. This modification
results in no amino acid change in DEN-2 at this site, but
substitutions do occur in the other three viruses. By
creating this XbaI site, all four viruses will contain the
sequence SRS at amino acid positions E 470-472 (see Figure
32). Nucleotide mutations that are necessary to create
the XbaI site are indicated by bold, underlined characters
in the nucleotide sequences of D2.VAC, D1.VAC, D3.VAC, and
D4.VAC and their respective primers designed for
amplification in PCR.
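
The site changes described for Figures 33 and 34 can be checked with a short sketch. It uses only the DEN-2 hexamers and genome positions quoted in the two captions; the function and dictionary names are illustrative and are not part of the specification. It lists which nucleotides must change in DEN-2 to create each site and confirms that both target sequences read the same on either strand, as expected for these restriction sites. The other three serotypes start from different native sequences, so their change lists would differ.

```python
# Complement table for the four DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mismatches(native, target):
    """Return (1-based offset, native base, target base) for each difference."""
    if len(native) != len(target):
        raise ValueError("sequences must have equal length")
    return [(i + 1, a, b)
            for i, (a, b) in enumerate(zip(native, target)) if a != b]


# Hexamers and start positions as quoted in the captions (DEN-2 only).
SITES = {
    "XhoI": {"start": 447, "native": "AACCAC", "target": "CTCGAG"},
    "XbaI": {"start": 2344, "native": "TCACGC", "target": "TCTAGA"},
}


def changed_genome_positions(site):
    """Genome positions that must be mutated to create the named site."""
    entry = SITES[site]
    return [entry["start"] + offset - 1
            for offset, _, _ in mismatches(entry["native"], entry["target"])]


if __name__ == "__main__":
    for name, entry in SITES.items():
        print(name, changed_genome_positions(name),
              "palindromic:", reverse_complement(entry["target"]) == entry["target"])
```

For DEN-2 this reports four changes for the XhoI site and three for the XbaI site.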

Figure 35: Construction of DEN-2 PDK-53 cassette
plasmids pF1-Xho and pF2-Xba. (A) pF1-Xho: Clone PCR cDNA
amplicons F1-prM5' and F1-prM3' into TA-vector. Sequence
and splice correct clones together at the SphI site in the
TA-vector to construct pF1-prM53 (not shown). Subclone
the prM53 cDNA into SstI/SphI-cut pF1-E (see Figure 20) to
construct pF1-Xho. (B) pF2-Xba: Clone PCR cDNA amplicons
F2-E5' and F2-E3' into TA-vector. Splice correct clones
together at the XbaI site in the TA-vector to construct
pF2-E53 (not shown). Subclone the SphI/HpaI E53 cDNA
fragment into pF2-16B (see Figure 21), which itself is
subcloned into pBRUC-139 between the SphI/KpnI sites (not
shown), to construct pF2-Xba. PCR amplimer designations
are underlined. Solid black bars indicate newly
synthesized and sequence-characterized cDNA. Stippled bar
indicates previously synthesized cDNA. Graph resolution =
64.1857 nucleotides/dot.

Figure 36: Construction of chimeric plasmids
containing the prM and E genes (XhoI-XbaI cDNA fragment)
of DEN-1, DEN-3, or DEN-4 candidate vaccine virus within
the genetic background of DEN-2 PDK-53 virus. pD2V-CAS12
was constructed by ligating the SstI/SphI fragment of
pF1-Xho and SphI/KpnI fragment of pF2-Xba (see Figure 33)
into a truncated form of pD2/IC-130V (see Figure 21).
pD2/IC-130V was truncated by restricting the full-length
clone at the NsiI-4696 and 3'-end XbaI sites, blunt-ending
with T4 DNA polymerase, and religating. This procedure
removed genome nucleotides 4696-10723, thereby removing
the XhoI-5426 and 3'-end XbaI sites, which would otherwise
interfere with construction of chimeric plasmid cassettes
using XhoI and XbaI sites. The cassette strategy employs
PCR amplification of DEN-1, DEN-3, and DEN-4 cDNAs
containing the prM and E genes; cutting the amplicons with
XhoI/XbaI; cloning resulting fragments into pD2V-CAS12 to
construct pD1V-CAS12, pD3V-CAS12, and pD4V-CAS12 chimeric
cassettes; confirming the chimeric XhoI/XbaI insert by
nucleotide sequence analysis; and then subcloning the
SstI/KpnI fragment of the chimeric cassette into
pD2/IC-130V to construct the chimeric full genome-length
cDNA clones from which chimeric DEN-2/1, -2/3, and -2/4
viruses are derived. The genetic background of DEN-2
PDK-53 virus is illustrated by the solid black bars. The
heterologous DEN-1, DEN-3, and DEN-4 cDNA inserts are
indicated by the stippled bars. The pBRUC-139 plasmid
backbone is not illustrated for pD1V-CAS12, pD3V-CAS12, or
pD4V-CAS12 chimeric plasmid. Resolution = 110.5464 bp/dot.

Detailed Description of the Invention 
We developed a quadravalent vaccine by initially
constructing a full genome-length infectious cDNA clone
for DEN-2 virus. We chose serotype 2 of DEN virus because
virus strains of this serotype generally replicate to high
titer in cell culture. We chose to develop an infectious
clone for the 16681 strain of DEN-2 virus because the
candidate vaccine viruses developed by Mahidol University
are currently the best live, attenuated vaccine virus
candidates in terms of immunogenic efficacy and lack of
reactogenicity in vaccinees. We developed an infectious
cDNA clone of the 16681 strain, which is the parent to the
DEN-2 PDK-53 candidate vaccine virus developed at Mahidol
University, to permit engineering of second and later
generation live, attenuated DEN vaccine viruses.

The infectious clone strategy was initiated with the
virulent parental 16681 strain obtained from the Division
of Vector-Borne Infectious Diseases (DVBID) of the Centers
for Disease Control and Prevention (CDC) virus collection.
We synthesized cDNA from the DEN-2 16681 viral RNA. The
immediate objective was to obtain an accurate full
genome-length infectious cDNA clone of the 16681 strain of
DEN-2 virus, since it was essential to develop a reliable
experimental system to permit routine genetic engineering
of the cDNA and recovery of virus. Our approach involved
using polymerase chain reaction (PCR) technology to create
cDNA clones that could be spliced together to construct a
single full genome-length clone (Figure 1) from which
full-length, infectious DEN-2 genomic mRNA could be
transcribed (Figure 2).

The first full-length sequence-characterized cDNA
clone, designated pD2/IC-20, was constructed in the high
copy number pUC19 plasmid vector. Successful
transcription of genome-length DEN-2 16681 viral RNA from
pD2/IC-20 was clearly demonstrated by agarose gel
electrophoresis of the transcription reaction product.
However, RNA transcribed from this particular clone failed
to yield infectious virus. It was determined that cDNA
errors had occurred during the clone manipulations. We
then decided to reconstruct the full-length clone in the
low copy number pBR322 plasmid. The full-length cDNA of
DEN-2 16681 virus was successfully moved into pBR322 to
construct pD2/IC-30P. Full-length, infectious DEN-2 16681
genomic RNA was subsequently transcribed from pD2/IC-30P.

The DEN-1 PDK-13, DEN-2 PDK-53, DEN-3 PGMK-30/FRhL-3,
and DEN-4 PDK-48 vaccine viruses were obtained from
Mahidol University. Our goal was to replace the entire
genomic cDNA backbone of the DEN-2 16681 full-length
clone with the cognate cDNA cloned from the genome of the
DEN-2 PDK-53 candidate vaccine virus. The prM and E genes
of the DEN-2 PDK-53 virus are then replaced with the prM
and E genes of the DEN-1 PDK-13, DEN-3 PGMK-30/FRhL-3,
and DEN-4 PDK-48 candidate vaccine viruses to construct
chimeric DEN-2/1, DEN-2/3, and DEN-2/4 viruses containing
the nonstructural genes of the DEN-2 PDK-53 virus and the
prM and E genes of the heterologous DEN viruses.
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DEN-2 PDK-53 Infectious cDNA Clone Backbone 

[Diagram of the backbone. Recoverable labels: NS1, 2A, 2B,
NS3, 4A, NS5, 3'-NC; prM inserts labeled DEN-1 PDK-13,
DEN-3 PGMK30/FRhL-3, and DEN-4 PDK-48.]


It is contemplated that chimeric, infectious
clone-derived DEN-2/1, DEN-2/3, and DEN-2/4 viruses will
result in immediate improvement in the efficacy of a
quadravalent vaccine. Our preliminary data from Mahidol
University indicate that very small amounts of the DEN-2
PDK-53 vaccine virus were required to infect and immunize
humans. However, the DEN-1, DEN-3, and DEN-4 vaccine virus
candidates had approximately 30-fold to 2000-fold lower
infectivity for humans. The low infective efficacies of
the DEN-1, DEN-3, and DEN-4 viruses create significant
problems in terms of vaccine efficacy in eliciting
seroconversion in vaccinees, as well as problems of
vaccine production for mass vaccination programs, since a
large volume, up to 1 ml, of undiluted cell
culture-derived vaccine virus must be administered to
achieve even minimal levels of infectivity for these
viruses. Since the increased infectivity of the DEN-2
PDK-53 vaccine virus is likely due to more efficient virus
replication, and since this replicative efficacy is
controlled by the nonstructural proteins of the virus,
then chimeric vaccine viruses that express the relevant
immunogenic structural proteins of DEN-1, DEN-3, or DEN-4
virus in the context of replication control by the
nonstructural gene products of the DEN-2 PDK-53 virus
should replicate better and be more infective and
immunogenic in human vaccinees than the original DEN-1,
DEN-3, and DEN-4 vaccine viruses containing nonchimeric
genotypes.

A quadravalent vaccine is obtained upon completion of
the following steps:

(1) A full genome-length infectious cDNA clone for a
    DEN-2 virus, strain 16681, is constructed.

(2) A full genome-length infectious cDNA clone for a
    DEN-2 16681 attenuated derivative, PDK-53, is
    constructed, preferably by substituting the
    genomic cDNA backbone of the DEN-2 16681
    full-length clone with the corresponding cDNA cloned
    from the genome of the DEN-2 PDK-53 candidate
    vaccine virus.

(3) The candidate DEN-1, DEN-3, and DEN-4 vaccine
    viruses are subjected to PCR amplification of
    cDNA from extracted genomic RNA, and chimeric
    infectious cDNA clones expressing the prM and E
    genes of DEN-1, DEN-3, and DEN-4 viruses,
    respectively, in the context of the
    nonstructural genes of the DEN-2 PDK-53 virus
    are constructed.

(4) The infectious clone-derived chimeric DEN-2/1,
    DEN-2/3, and DEN-2/4 vaccine viruses are tested
    to ensure that they:

    (a) Are viable;
    (b) Express appropriate virus-specific immunogens;
    (c) Replicate to sufficient titer in cell culture;
    (d) Are infectious and immunogenic for humans; and
    (e) Retain phenotypic markers of attenuation.



There is no good animal model for investigating
dengue pathogenesis. DEN viruses are naturally
transmitted between mosquitos and humans. Although lower
primates can be infected with these viruses, they do not
develop the clinical profiles that occur in humans.
Infectious clone-derived viruses can be compared to their
more virulent parental strains using certain in vitro and
in vivo markers:

In Vitro Markers:

    Plaque size in cell culture;
    Temperature sensitivity;
    Cytopathic effects (CPE) in LLC-MK2 cells; and
    Replication in macrophages.

In Vivo Markers:

    Virulence by intracranial route in mice;
    Viremia in monkeys;
    Virulence by intracranial route in monkeys; and
    Elicitation of neutralizing antibodies in animals.

Infectious cDNA clones are expressed, the resulting
RNA transcripts are transfected into permissive cells,
and the live, attenuated viruses are formulated into
vaccines.

Additionally, the DEN-2 PDK-53 and chimeric DEN-2/1,
DEN-2/3, and DEN-2/4 infectious cDNA clones can by
themselves confer immunity by DNA immunization, a form of
gene therapy involving the direct inoculation of naked DNA
into the host such that its expression produces an immune
response (e.g., Ulmer et al., 1993 (DNA immunization
protected against influenza); Cox et al., 1993 (DNA
immunization protected against herpesvirus); Xiang et al.,
1994 (DNA immunization protected against rabies); Sedegah
et al., 1994 (DNA immunization protected against
malaria)).

Moreover, infectious cDNA clones are exquisite tools
for studying the molecular biology of virus structure,
function, and replication. This has been amply
demonstrated for many RNA viruses in the literature,
including Venezuelan equine encephalitis virus as reported
by Kinney et al. (1989). A successful infectious cDNA
clone of DEN-2 virus permits important investigations of
dengue virus replication, pathogenesis, and antigenic
structure. Infectious clone cDNA templates permit the
directed engineering of virus vaccines. Directed
site-specific, nonrandom mutations can readily be made in
infectious cDNA clones, and therefore in clone-derived
viruses, using a wide variety of DNA modification enzymes,
restriction endonucleases, and in vitro mutagenesis
methods. DNA is easier to manipulate than RNA, and the
10^-9 error rate of DNA replication is much lower than the
10^-3 to 10^-4 error rate produced by RNA polymerases.
Infectious cDNA clones permit direct analyses of the
phenotypic effects of individual and cumulative mutations
in the viral genome. An infectious cDNA clone provides a
"gold standard" reference sequence for a vaccine.
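
The practical weight of these error rates becomes clearer when they are multiplied by the genome length. The sketch below is an illustration only: it assumes the rates are per nucleotide per copying event, and it takes 10723 nucleotides as the genome length because that is the last genome position cited for the DEN-2 clone in the Figure 36 description. The function name is hypothetical.

```python
# Assumed genome length: the last genome nucleotide position cited for the
# DEN-2 clone (Figure 36 description). Treat it as an approximation.
GENOME_LENGTH_NT = 10723


def expected_errors_per_copy(rate_per_nt, length_nt=GENOME_LENGTH_NT):
    """Expected misincorporations in one copy, assuming independent sites."""
    if rate_per_nt < 0 or length_nt <= 0:
        raise ValueError("rate must be non-negative and length positive")
    return rate_per_nt * length_nt


if __name__ == "__main__":
    print("DNA replication (1e-9):", expected_errors_per_copy(1e-9))
    print("RNA polymerase  (1e-4):", expected_errors_per_copy(1e-4))
    print("RNA polymerase  (1e-3):", expected_errors_per_copy(1e-3))
```

Under these assumptions an RNA polymerase introduces roughly one to ten errors per genome copy, whereas DNA replication introduces about one error per hundred thousand copies, which is why a cDNA clone can serve as a stable reference sequence.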

Particular aspects of the invention may be more
readily understood by reference to the following examples,
which are intended to exemplify the invention, without
limiting its scope to the particular exemplified
embodiments.
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EXAMPLES 



Most of the background, protocols, and recipes used
in recombinant DNA work can be found in Molecular Cloning:
A Laboratory Manual (Sambrook et al., 1989), and Current
Protocols in Molecular Biology (Ausubel et al., 1989).

Viruses:

The virulent parental DEN-2 16681 strain was
immediately available in the DVBID collection of viruses.
We received the DEN-1 PDK-13, DEN-2 PDK-53, DEN-3
PGMK-30/FRhL-3, and DEN-4 PDK-48 vaccine viruses from
Mahidol University. The DEN vaccine viruses were passaged
in primary dog kidney (PDK) cells because this cell
culture is included among those cell types that are
certified for human use by the Bureau of Biologics, US
Food and Drug Administration (Yoksan et al., 1986). The
virus strain designations are shown below:

    Virus     Parent Strain     Vaccine Derivative Strain

    DEN-1     16007             PDK-13
    DEN-2     16681             PDK-53
    DEN-3     16562             PGMK-30/FRhL-3
    DEN-4     1036              PDK-48

    PDK  = primary dog kidney cells
    FRhL = fetal rhesus lung cells
    PGMK = primary green monkey kidney cells



DEN-1 16007 Parent

► Recovered from serum of a patient with hemorrhagic fever
  and shock in Thailand in 1964
► Passaged 3X in BS-C-1 cells, 1X in LLC-MK2 cells
► Passaged 2X in Toxorhynchites amboinensis mosquitos
► PDK-1

PDK-43 Vaccine


DEN-2 16681 Parent

► Recovered from serum of a patient with hemorrhagic fever
  and shock in Thailand in 1964
► Passaged 3X in BS-C-1 cells, 1X in LLC-MK2 cells
► Passaged 2X in Toxorhynchites amboinensis mosquitos
► PDK-1

PDK-53 Vaccine


DEN-3 16562 Parent

► Recovered from serum of a patient with hemorrhagic fever
  and shock in the Philippines in 1964
► Passaged 3X in BS-C-1 cells, 1X in LLC-MK2 cells
► Passaged 2X in Toxorhynchites amboinensis mosquitos
► PGMK-1

PGMK-30     (DEN-3 virus grown in PGMK cells replicated to
            very low titer in PDK cells (Yoksan et al., 1986))

FRhL-3 Vaccine


DEN-4 1036 Parent

► Recovered from serum of a patient with dengue fever in
  Indonesia in 1976
► Passed 4X in Aedes aegypti mosquitos
► PDK-1

PDK-48



The DEN-2 full-length cDNA clone was derived from the
DVBID seed of DEN-2 16681 virus, which had the passage
history:

    Human
    3X BS-C-1 cells
    2X LLC-MK2 cells
    2X T. amboinensis mosquitos
    4X C6/36 cells (Aedes albopictus)

Complementary DNA (cDNA) was amplified by RT/PCR
directly, without further cell culture passage, from virus
present in vaccine vials of the DEN-1 PDK-43, DEN-2
PDK-53, DEN-3 PGMK-30/FRhL-3, and DEN-4 PDK-48 viruses.

Stock virus seed was prepared from virus-infected
cells grown in 75 or 150 cm² plastic tissue culture flasks.
The culture medium was clarified by centrifugation for 30
min at 10,000 rpm in a Sorvall GSA rotor, the final
concentration of fetal bovine serum (FBS) was brought to
10% (v/v), and the clarified virus suspension was then
frozen in aliquots of 0.5-1.0 ml at -70°C. Gradient
purified DEN-2 16681 virus was prepared according to the
method of Obijeski et al. (1976) as reported by Kinney et
al. (1983).

Cell Lines:

Infectious virus was derived from the infectious cDNA
clones by electroporation of BHK-21-15 (baby hamster
kidney-21, clone 15) cells with transcribed viral RNA.
Viruses were also grown in LLC-MK2 monkey kidney cells,
Vero African green monkey kidney cells, and C6/36 mosquito
cells (Aedes albopictus C6 cells, clone 36, Igarashi
(1978)). All four cell lines were grown in Eagle's
minimal essential medium (MEM) supplemented with 10% (v/v)
heat-inactivated (56°C for 30 min) FBS, 1.25 g/L of sodium
bicarbonate, 100 units/ml of penicillin G, and 100 µg/ml
of streptomycin sulfate. Confluent cell monolayers grown
in plastic tissue culture flasks were infected by
decanting the growth medium, permitting the virus inoculum
to adsorb for 1.5 h at 37°C, and then adding MEM
containing 5% FBS. For plaque titration of viruses,
confluent cell monolayers in plastic 6-well trays were
inoculated with 200 µl of the appropriate dilution of
virus. Virus was adsorbed to the cell monolayer for 1.5 h
at 37°C. The cells were then overlaid with 3 ml of 1%
(w/v) Noble agar (maintained at 40°C) in MEM lacking
phenol red pH indicator and containing 2% FBS and 0.01%
(w/v) DEAE-dextran. Following incubation for 6 days at
37°C in a 5% CO2 atmosphere, a second 1-ml agar overlay
containing 50 µg/ml of neutral red vital stain was added.
Viral plaques were counted 2-5 days later.
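
Plaque counts from this titration convert to a titer with the usual relation, titer = plaques / (dilution x inoculum volume). The following is a minimal sketch of that arithmetic, using the 200-µl inoculum stated above as the default volume; the function name and the example count are hypothetical and are not data from this work.

```python
def titer_pfu_per_ml(plaque_count, dilution, inoculum_ml=0.2):
    """Convert a plaque count to PFU/ml.

    dilution is the fraction of undiluted virus in the inoculum,
    for example 1e-4 for a 10^-4 dilution. The default inoculum
    of 0.2 ml matches the 200-ul volume plated per well.
    """
    if plaque_count < 0:
        raise ValueError("plaque_count must be non-negative")
    if dilution <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution and inoculum_ml must be positive")
    return plaque_count / (dilution * inoculum_ml)


if __name__ == "__main__":
    # Hypothetical well: 40 plaques counted at the 10^-4 dilution.
    print(f"{titer_pfu_per_ml(40, 1e-4):.2e} PFU/ml")
```

A well with 40 plaques at the 10^-4 dilution would correspond to about 2 x 10^6 PFU/ml.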

The E. coli K-12 strains used in this project
included XL1-Blue, MC-1061, SURE, JM101, and TB-1.
Recombinant plasmid containing full genome-length cDNA of
DEN-2 virus was successfully replicated in E. coli
XL1-Blue, MC-1061, and TB-1. Flavivirus cDNA, particularly
the gene region encoding the envelope glycoprotein, is
troublesome in E. coli. Bacteria hosting the recombinant
plasmid containing the full-length cDNA clone grew slowly
and were often difficult to streak for isolation on agar
plates containing selective antibiotic. Transformation
efficiencies were sometimes improved somewhat by
incubation of agar plates at 30°C or ambient temperature
rather than at 37°C. Bacterial stocks were stored frozen
at -70°C in 10% (v/v) glycerol.

Precautions for Working with RNA:

RNA is a fragile molecule that is very readily
degraded by the many ubiquitous RNases present in the
environment. Many of these RNases are resistant to
treatment with detergents and heat, including autoclaving.
All reagents and materials that contacted the viral RNA in



WO 96/40933 PCT/US96/09209 

this project were RNase-free to avoid degradation of the
viral RNA by these ubiquitous, very stable enzymes. The
investigator wore tight-fitting gloves, maintained all
reagents on ice, used a plastic tool to open the lids of
microtubes, used individually packaged pipets (preferably
plastic for aqueous solutions) and disposable plasticware,
which is generally RNase-free before opening, and used
"For RNA Only" microtubes, Gilson micropipetors (P-10,
P-20, P-100, P-200, P-1000), and tips with aerosol
barriers. Use of recycled glassware was avoided. Weigh
boats, magnetic stirrers, and pH meters were not used.
Chemicals were weighed in sterile, RNase-free disposable
plastic 50-ml centrifuge tubes, and solutions were
adjusted to the appropriate pH by aliquoting a small
volume of the solution onto pH paper. Whenever possible,
commercially prepared, guaranteed RNase-free reagents were
purchased. Otherwise, newly-opened chemicals were reserved
"For RNA Only". Water and stock salt solutions, except for
those containing Tris, were treated overnight with 0.1%
(v/v) diethylpyrocarbonate (DEPC) to inactivate RNases via
alkylation and then autoclaved for 20 min. It is
advisable to use the best sterile technique when working
with RNA.

Extraction of Viral Genomic RNA from Virus Seed:

Virus seeds containing at least 10^6 PFU (plaque
forming units)/ml of virus are ideal for providing
appropriate yields of RNA. Seed with virus titer of 10^4
or lower can be problematic in terms of yielding
sufficient RNA. For these low-titer seeds it is best to
pool the yields of several extracted seed aliquots.

RNA extraction involved the addition of 200 µl of
cold RNA lysis buffer (4 M guanidine isothiocyanate, 25 mM
sodium citrate, pH 7.0, 0.5% (w/v) sarkosyl, and 100 mM
beta-mercaptoethanol), and 30 µl of 3 M sodium acetate, pH
5.2, to an empty RNase-free 1.5-ml microtube on ice. In a
biosafety cabinet, 200 µl of DEN virus seed was added to
the microtube and mixed vigorously for 30 sec with a
mechanical mixer. The tube was centrifuged briefly to
collect the liquid; then 400 µl of cold phenol
(commercially supplied by AMRESCO) equilibrated to pH 4.5
and 80 µl of cold chloroform were added. The tube was
mixed vigorously for 30 sec, placed on ice for 15 min,
mixed again, then centrifuged for 1 min at maximum speed
in a refrigerated microcentrifuge to separate the aqueous
and organic phases. The top aqueous phase containing the
extracted RNA was transferred to a fresh 1.5-ml microtube
on ice, 400 µl of cold isopropanol was added, and the tube
was incubated for at least 1 h or overnight at -20°C. The
RNA was pelleted by centrifugation for 10 min at
maximum speed at 4°C. The supernatant was removed with a
pipet rather than by decantation, and the pellet was
rinsed with 500 µl of 75% (v/v) ethanol. After spinning
again for 10 min, the ethanol was removed with a pipet.
The tube was centrifuged again briefly and the residual
liquid was removed with a micropipet. The RNA pellet was
air dried briefly, resuspended in 50 µl of cold RNase-free
dH2O, and stored frozen. For seeds containing low virus
titer, the RNA pellets in 3-6 microtubes were pooled in a
total volume of 50 µl.



RT/PCR Synthesis of Dengue Virus-Specific cDNA Fragments:

Full-length genomic mRNA was extracted directly from
200 µl of DEN virus seed. The standard reverse
transcriptase/polymerase chain reaction (RT/PCR) was
performed in a 100-µl reaction solution containing 5-18 µl
of the extracted viral RNA, 1 µl each of 100 µM stock
solutions (stored frozen in dH2O) of the upstream
mRNA-sense primer-amplimer and downstream
complementary-sense primer-amplimer, 10 µl of 10X standard
PCR buffer (500 mM KCl, 100 mM Tris-HCl, pH 8.5, 15 mM
MgCl2 and 0.1% (w/v) gelatin), 8.0 µl of 2.5 mM dNTPs
(2.5 mM each of dATP, dCTP, dGTP, and dTTP;
Pharmacia-LKB), 0.5 µl of 1 M dithiothreitol (DTT), 0.5 µl
of RNase inhibitor (RNasin, 40 U/µl, Boehringer-Mannheim),
0.5 µl of Taq DNA polymerase (5 U/µl, Perkin-Elmer), and
0.5 µl of RAV-2 reverse transcriptase (18 U/µl, Takara).
The reaction solution was made as two components:
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10 



► PCR Reaction Mix:

     10.0 µl    10X Standard PCR Buffer
      8.0 µl    2.5 mM dNTPs
      0.5 µl    1 M DTT
      0.5 µl    RNasin (40 U/µl)
      0.5 µl    Taq DNA Polymerase (5 U/µl)
      0.5 µl    RAV-2 RT (18 U/µl)
     60.0 µl    RNase-Free dH2O
     -------
     80.0 µl    Reaction Mix for 1 reaction. Make more
                than needed for all reaction tubes. Store
                excess at -70°C for reuse.

► Template/Primer Mix:

     18.0 µl    DEN-2 RNA Template
      1.0 µl    100 µM Up-Amplimer
      1.0 µl    100 µM Down-Amplimer
     -------
     20.0 µl

► Reaction Solution:

     80.0 µl    PCR Reaction Mix
     20.0 µl    Template/Primer Mix
     -------
    100.0 µl    In a thin-walled, 200-µl microtube.


The RT/PCR reactions in thin-wall 200-µl microtubes
(Phenix Research Products) were incubated without oil
overlay in a Perkin-Elmer Model 9600 thermocycler
according to the following program:

    50°C for 60 min  =  First strand cDNA synthesis
                        by reverse transcriptase

    94°C for 4 min
    50°C for 1 min
    72°C for 5 min

    94°C for 30 sec
    55°C for 30 sec
    72°C for 5 min
    Delta +10 sec/cycle
    30 Cycles
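
The programmed hold time of this run can be estimated with a short calculation. The sketch rests on two assumptions that the listing does not state explicitly: that the "+10 sec/cycle" delta lengthens the 72°C extension step, and that the first of the 30 cycles runs without any added time. Ramp times between temperatures are ignored, and all names are illustrative.

```python
RT_STEP_S = 60 * 60                 # 50 C for 60 min, first strand synthesis
INITIAL_CYCLE_S = (4 + 1 + 5) * 60  # 94 C 4 min, 50 C 1 min, 72 C 5 min
DENATURE_S = 30                     # 94 C for 30 sec
ANNEAL_S = 30                       # 55 C for 30 sec
EXTEND_S = 5 * 60                   # 72 C for 5 min
DELTA_S = 10                        # assumed to extend the 72 C step
CYCLES = 30


def programmed_seconds(cycles=CYCLES, delta_s=DELTA_S):
    """Total hold time in seconds, excluding temperature ramps."""
    total = RT_STEP_S + INITIAL_CYCLE_S
    for i in range(cycles):
        # Cycle i (0-based) extends for EXTEND_S plus i increments of delta.
        total += DENATURE_S + ANNEAL_S + EXTEND_S + delta_s * i
    return total


if __name__ == "__main__":
    print(f"{programmed_seconds() / 60:.1f} min of programmed holds")
```

Under these assumptions the holds add up to a little over five hours, of which more than an hour comes from the per-cycle extension increment.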




Following completion of the RT/PCR reactions, 5-µl
aliquots of each of the 100-µl reactions were analyzed by
agarose gel electrophoresis. The DNA bands in the agarose
gel were stained in ethidium bromide (500 ng/ml) solution
and visualized on an ultraviolet light box. Since
extraneous non-target cDNA bands are often amplified in
addition to the target cDNA molecules, the remaining 95 µl
of each RT/PCR reaction was electrophoresed in a larger,
preparative agarose gel, and the target cDNA was stained
briefly, excised with a razor blade, and physically
extracted from the agarose slice.

High-Melt-Crush (HMC) Extraction of DNA from Agarose:

An agarose gel slice containing DNA was placed in a
1.5-ml microtube and crushed thoroughly with a spatula or
pestle. The volume of the crushed agarose was brought to
400-500 µl with TE buffer (10 mM Tris-HCl, pH 7.5,
1 mM disodium EDTA) and 400 µl of phenol (supplied by
AMRESCO), pH 8, was added. The agarose suspension was
mixed vigorously using a mechanical mixer, frozen, thawed
and mixed, frozen, thawed and mixed, and then centrifuged
for 10 min at maximum speed at 4°C. The top aqueous phase
was transferred to a fresh microtube, extracted with 400
µl of phenol:chloroform:isoamyl alcohol (25:24:1) and
centrifuged for 2 min. The top aqueous phase was
transferred to a fresh tube and extracted with 700 µl of
diethyl ether or chloroform. If chloroform was used, the
top phase was again transferred to a fresh tube after a
brief spin to separate phases. The DNA was precipitated
for at least 30 min at -70°C or overnight at -20°C
following addition of 2.5 volumes (essentially filling the
microtube) of 95% ethanol containing 300 mM ammonium
acetate and 10 mM MgCl2. The DNA was pelleted at 4°C by
centrifugation for 20 min at maximum speed. The liquid
was decanted, and the DNA pellet was rinsed with 500 µl of
75% ethanol, air-dried briefly, dissolved in 30 µl of TE
buffer, and stored frozen or in the refrigerator. A 3-µl
aliquot of the extracted DNA was analyzed for purity and
quantity by agarose gel electrophoresis. Generally,
20-80% of the DNA loaded onto a gel can be recovered from
the gel by this method.

Agarose Gels:

DNA was analyzed by electrophoresis in 1% (w/v)
agarose gels run in TBE buffer (100 mM Tris-HCl, pH 8, 91
mM boric acid, and 20 mM disodium EDTA). DNA bands were
visualized by staining the gel in water containing 500
ng/ml of ethidium bromide and exposure to ultraviolet
light. Gels used for analyzing RNA transcripts were made
with RNase-free reagents. Ethidium bromide stain was
incorporated in the gel and running buffer so that the RNA
bands could be visualized immediately. To obtain
gel-purified DNA fragments, DNA was electrophoresed in 0.7%
(w/v) agarose gels made with genetic technology grade
Seakem agarose (FMC) or with biotechnology grade agarose
(3:1 high resolution blend, AMRESCO).

Cloning of Dengue Virus-Specific cDNA Fragments:

Some DNA polymerases add an extra "A" nucleotide
overhang at the 3'-end of synthesized DNA strands. The
Taq DNA polymerase does this. To enable the cloning of
DNA molecules synthesized using Taq DNA polymerase,
TA-cloning vectors have been engineered (Marchuk et al.,
1991). These vectors generally have a single "T" overhang
engineered at the 3'-terminus of EcoRV-cut, blunt-ended,
linearized plasmid vector. The EcoRV site occurs within
the multiple cloning site (MCS) of the plasmid. The MCS
is a series of contiguous, unique restriction enzyme
(RENZ) sites engineered into a vector plasmid to permit
subcloning of exogenous DNA fragments following
restriction with a variety of RENZs. The HMC-purified DEN
cDNA amplicons were cloned into the 3900-bp pCRII
(Invitrogen), the 2887-bp pT7Blue(R) (pT7Blue, Novagen),
or the 3003-bp pGEM-5Zf (Promega) TA-vector plasmid. The
RENZ sites available in the MCS region of these
TA-vectors, as well as the RENZ sites of the MCS of the
general purpose cloning plasmids, pUC18 and pUC19, used in
this project are shown below.

RENZ Sites Present in the MCS of Several Cloning Vectors



    pUC18      pUC19      pT7Blue    pCRII      pGEM-5Zf

    -          -          T7         SP6        T7
    EcoRI      HindIII    HindIII    NsiI       ApaI
    SstI       SphI       BspMI      HindIII    AatII
    KpnI       PstI       SphI       KpnI       SphI
    SmaI       SalI       PstI       SstI       NcoI
    BamHI      XbaI       Sse8387I   BamHI      SstII
    XbaI       BamHI      SalI       SpeI       EcoRV
    SalI       SmaI       AccI       BstXI      SpeI
    PstI       KpnI       HincII     EcoRI      NotI
    SphI       SstI       XbaI       EcoRV      PstI
    HindIII    EcoRI      SpeI       EcoRI      SalI
                          NdeI       PstI       NdeI
                          EcoRV      BstXI      SacI
                          BamHI      NotI       BstXI
                          AvaI       AvaI       NsiI
                          SmaI       SphI       SP6
                          KpnI       NsiI
                          SacI       XbaI
                          BanII      ApaI
                          EcoRI      T7



The pUC18/19 plasmids possess identical MCS sites in
reverse orientation in the plasmid backbone. Their
purpose is to permit cloning of DNA in either orientation
into the plasmid using the same pair of RENZs; this
reversibility was exploited in this project. The
TA-vectors used here all possessed T7 and/or SP6
bacteriophage RNA promoters to enable RNA transcription
from cloned DNA. These promoters were not used in this
project. All of the plasmids contain the gene for
ampicillin resistance. They also contained the lacZ
portion of the E. coli lac operon. This permits color
discrimination between bacterial colonies that receive a
recombinant or a wild-type plasmid. In the presence of
IPTG and X-gal, bacterial colonies that are transformed
with a wild-type plasmid lacking a cDNA insert develop a
blue color, whereas cells that receive a recombinant
plasmid with cDNA cloned into the MCS of the plasmid are
white. Agar plates contained 800 µg of IPTG and 800 µg of
X-gal.

Fifty to 100 ng of HMC-purified amplicon was ligated
to 50 ng of the pCRII vector using the TA-vector cloning
kit supplied by Invitrogen, exactly as specified by the
instructions supplied with the kit. Frozen,
transformation-competent E. coli INVαF' cells, supplied
with the Invitrogen kit and stored at -70°C, were
transformed with the ligated DNA as described in the kit
instructions. The transformed cells were plated on YTA50
agar plates (8 g of DIFCO tryptone, 5 g of DIFCO yeast
extract, 5 g of NaCl, and 15 g of BACTO agar per liter of
dH2O) containing 50 µg/ml of ampicillin. Only bacterial
cells transformed with the pCRII plasmid, which contains
an ampicillin resistance gene, grow on this medium. The
agar plates were incubated at 37°C overnight.

Similarly, cDNA was ligated to the other TA-vectors
or to pUC18/19 cut with the appropriate RENZ(s).
Ligations were performed at room temperature or at 12°C.
E. coli XL1-Blue, SURE, TB-1, or MC-1061 cells were
transformed by electroporation and plated on YTA50 plates.
Electroporation was performed according to Dower et al.
(1988) using cuvettes with a 0.2-cm electrode gap in a Bio-
Rad Gene Pulser set at 2.5 kV, 25 µF capacitance,
and 200 ohms resistance. Electroporation-competent cells
were prepared by growing a fresh bacterial culture to an
optical density of 0.5-0.7 at 600 nm. The cells from 1.5
to 3 L of culture were pelleted by centrifugation for 10
min at 4°C and 5000 rpm in a Sorvall GSA rotor, pooled,
washed twice in 1 mM Hepes buffer, and resuspended in 2 ml
of 10% (v/v) sterile glycerol per L of original culture.
The concentrated cells in glycerol were stored at -70°C.
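
The pulse settings given for E. coli determine two derived quantities that are useful to check before a run: the RC time constant and the nominal field strength across the cuvette. The following is a minimal sketch, assuming an ideal RC discharge and a 0.2-cm electrode gap; the function names are illustrative and are not part of any instrument software.

```python
def time_constant_ms(resistance_ohm: float, capacitance_uf: float) -> float:
    """Time constant of an ideal RC discharge, in milliseconds."""
    # ohm x microfarad = microsecond; divide by 1000 to get milliseconds
    return resistance_ohm * capacitance_uf / 1000.0


def field_strength_kv_per_cm(voltage_kv: float, gap_cm: float) -> float:
    """Nominal field strength across the cuvette, in kV/cm."""
    return voltage_kv / gap_cm


if __name__ == "__main__":
    # 200 ohms and 25 microfarads are the settings stated in the text
    print(time_constant_ms(200, 25))
    # the 0.2-cm gap is an assumption (standard cuvette size)
    print(field_strength_kv_per_cm(2.5, 0.2))
```

Under these assumptions the ideal time constant is 5 ms and the nominal field is 12.5 kV/cm. A measured time constant may be somewhat shorter, because the sample itself conducts in parallel with the set resistance.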

Bacterial colonies were transferred to 2 ml of 2XYT-
Amp50 broth (16 g of tryptone, 20 g of yeast extract, and 5
g of NaCl per liter of dH2O) and incubated overnight with
shaking at 300 rpm at 37°C in a floor model incubator-
shaker (model Innova 4300, New Brunswick). Recombinant
plasmid was extracted from these 2-ml minicultures and
analyzed by agarose gel electrophoresis for the presence
of cDNA insert. Recombinant plasmids are larger than wild-
type vector plasmid because of the cDNA insert, and they
migrate more slowly than wild-type plasmid in agarose
gels.

All of the DEN-2 16681 virus-specific cDNA amplicons
were cloned into the pCRII TA-vector. Aliquots of insert-
positive miniprep plasmids were digested with the
restriction enzyme EcoRI. Since the pCRII MCS contains
two EcoRI recognition sites (palindromic hexameric
sequence GAATTC), one on either side of the EcoRV cDNA
cloning site, this RENZ cleaved the cDNA insert from the
plasmid vector and cleaved any EcoRI sites that were
present within the cDNA itself. The EcoRI-restricted DNA was
analyzed by agarose gel electrophoresis to determine that
the cloned cDNA was of appropriate size. In our
experience, cloning of PCR-derived cDNA amplicons 2000 bp
or smaller in size into the TA-vector is efficient.
Cloning amplicons larger than 3500 bp into the TA-vector
can be very difficult.
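
The expected band pattern of such an EcoRI digest can be predicted from sequence. The sketch below is illustrative only: it scans one strand for GAATTC, places the cut after the G, and returns fragment sizes for a circular or linear molecule. The toy sequence is hypothetical, and a site that spans the arbitrary start of a circular sequence is not detected.

```python
ECORI_SITE = "GAATTC"
CUT_OFFSET = 1  # EcoRI cuts between G and AATTC


def cut_positions(seq: str) -> list:
    """Return 0-based cut positions for every EcoRI site in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(ECORI_SITE)
    while start != -1:
        positions.append(start + CUT_OFFSET)
        start = seq.find(ECORI_SITE, start + 1)
    return positions


def fragment_sizes(seq: str, circular: bool = True) -> list:
    """Return fragment sizes (bp), largest first, after complete digestion."""
    cuts = cut_positions(seq)
    length = len(seq)
    if not cuts:
        return [length]
    if circular:
        # fragments between consecutive cuts, plus the one spanning the origin
        sizes = [b - a for a, b in zip(cuts, cuts[1:])]
        sizes.append(length - cuts[-1] + cuts[0])
    else:
        bounds = [0] + cuts + [length]
        sizes = [b - a for a, b in zip(bounds, bounds[1:])]
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    toy_plasmid = "AAGAATTCAAAAGAATTCAA"  # hypothetical 20-nt example
    print(fragment_sizes(toy_plasmid, circular=True))
    print(fragment_sizes(toy_plasmid, circular=False))
```

For a recombinant pCRII plasmid, one fragment corresponds to the vector backbone and the remaining fragments sum to the insert, so an insert with internal EcoRI sites yields more than two bands.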

After screening, certain of the miniprep plasmids
were selected for further analysis. Their corresponding
bacterial minicultures were streaked for isolation on YTA50
plates, and an isolated colony was inoculated into
50-200 ml of YTA50 broth to grow up a preparative amount of
recombinant plasmid. The preparative-scale extraction
of the plasmid was essentially identical to
that for minipreps except for scaled-up volumes.

Extraction of Plasmid DNA from Minicultures of E. coli:

White colonies containing recombinant plasmid were
picked with a sterile toothpick and shaken overnight at
300 rpm in 2 ml of 2X-YTA50 broth. Each miniculture was
decanted into a 1.5-ml microtube, and the cells were
pelleted by centrifugation at 6000 rpm for 2 min. The
supernatant was aspirated, and the cell pellet was
resuspended gently by up/down micropipeting in 200 µl of
GTE buffer (50 mM glucose, 25 mM Tris-HCl, pH 8.0, and 25
mM disodium EDTA) and then mixed with 300 µl of lysis
buffer (0.2 N NaOH, 1% (w/v) sodium dodecylsulfate (SDS)).
After incubation on ice for 5 min, 300 µl of cold
potassium acetate solution (3 M potassium acetate, 7 M
acetic acid, pH 4.8) was added, and the solution was
chilled for 5 min on ice and then centrifuged at maximum
speed for 10 min at 4°C. The supernatant was poured into
a fresh microtube, RNase A was added to 20 µg/ml, and the
mixture was incubated at 37°C for 30 min. The sample was
extracted twice with 600 µl of chloroform and centrifuged
for 1 min at maximum speed at room temperature. The DNA
pellet was dissolved in 32 µl of dH2O. Eight µl of 4 M NaCl
and 40 µl of 13% (w/v) PEG-8000 were added, and the mixed
solution was incubated for 5 min on ice. The sample was
centrifuged for 15 min at maximum speed at 4°C, the liquid
was aspirated with a micropipet, and the pellet was
rinsed with 500 µl of 75% ethanol. The air-dried pellet
was dissolved in 30 µl of dH2O and stored frozen until
used.

Extraction of Plasmid DNA from Large Cultures of E. coli:

Preparative-scale plasmid extraction was performed by
inoculating 100 ml of 2X-YTA50 broth with 2 ml of an
overnight culture of E. coli. The culture was shaken
overnight at 300 rpm and 37°C. The cells were pelleted by
centrifugation for 10 min at 5000 rpm in a Sorvall GSA
rotor and resuspended in 6 ml of cold GTE buffer. Nine ml
of a freshly made solution of 0.2 N NaOH and 1% (w/v) SDS
was added. The sample was incubated for 5 min on ice,
then 9 ml of cold 3 M potassium acetate solution was
added. After another 5-min incubation on ice, the tube
was centrifuged for 20 min at 10,000 rpm at room
temperature and the supernatant was transferred to a fresh
30-ml glass tube. RNase A was added to 20 µg/ml, and the
sample was incubated for 30 min at 37°C and then extracted
twice with 6 ml of chloroform. Twelve ml of room-
temperature isopropanol was added and the tube was
centrifuged immediately for 20 min at 10,000 rpm at room
temperature. The supernatant was decanted, and the DNA
pellet was rinsed with 1 ml of 75% ethanol, air dried
briefly, and resuspended in 480 µl of dH2O. The DNA was
precipitated by addition of 120 µl of 4 M NaCl and 600 µl
of 13% PEG-8000, incubation for 5 min on ice, and
centrifugation for 15 min at maximum speed at 4°C. The
DNA pellet was rinsed with 500 µl of 75% ethanol, air
dried briefly, rehydrated in TE buffer, and stored frozen.
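
The miniprep and preparative-scale PEG precipitations use different volumes but the same proportions. A short sketch of the arithmetic, assuming simple additive volumes (the function name is illustrative), shows that both reach the same final NaCl and PEG-8000 concentrations:

```python
def peg_precipitation_mix(dna_ul: float, nacl_ul: float, peg_ul: float,
                          nacl_stock_molar: float = 4.0,
                          peg_stock_percent: float = 13.0) -> tuple:
    """Return (final NaCl in M, final PEG-8000 in % w/v) after mixing."""
    total_ul = dna_ul + nacl_ul + peg_ul
    final_nacl = nacl_ul * nacl_stock_molar / total_ul
    final_peg = peg_ul * peg_stock_percent / total_ul
    return final_nacl, final_peg


if __name__ == "__main__":
    print(peg_precipitation_mix(32, 8, 40))      # miniprep volumes
    print(peg_precipitation_mix(480, 120, 600))  # preparative-scale volumes
```

Both calls give 0.4 M NaCl and 6.5% PEG-8000, so the preparative protocol is a 15-fold scale-up of the miniprep step.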

Nucleotide Sequence Analysis of the Dengue cDNA Clones:

Nucleotide sequence analyses of DEN-2 16681 cDNA
clones #1-#15 were performed by cloning EcoRI restriction
fragments of each clone into the single-stranded
bacteriophage M13mp18 or M13mp19. Since this is not the
current method of choice for sequencing, the method will
be described only briefly here. The procedure used for
the extraction of plasmid DNA from bacterial cells was
also used to extract the intracellular double-stranded
replicative form (RF) DNA of M13 from bacteriophage-
infected E. coli JM101 cells. The RF DNA was linearized
at the EcoRI site of the MCS and ligated to the DEN-2 HMC-
purified EcoRI cDNA restriction fragments.
Electroporation-competent E. coli JM101 cells were
transformed by electroporation and plated onto H-agar
plates (10 g of DIFCO tryptone, 5 g of NaCl, 15 g of BACTO
agar, and 1% (w/v) thiamine per liter of dH2O) containing
800 µg each of isopropyl-β-D-thiogalactopyranoside (IPTG) and
5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (BCIG or
X-gal). The electroporated cells were mixed with 300 µl
of a fresh logarithmic culture of JM101 cells and 3 ml of
warm (51°C) top H-agar containing 9 g/L of agar and then
poured onto the H-agar plates. Cells that were
transfected with recombinant DNA supported replication of
recombinant M13 virus, resulting in the formation of
bacteriophage plaques in the JM101 cell lawn on the agar
plate. The IPTG/BCIG histochemistry of the system
permitted identification of white plaques containing
recombinant bacteriophage in which cDNA had been ligated
into the EcoRI site of the MCS, whereas wild-type
nonrecombinant M13 bacteriophage produced blue plaques.
Isolated plaques were picked, inoculated into 3 ml of a
fresh, pre-logarithmic phase culture of JM101, and shaken
at 37°C for 8-16 h. The minicultures were clarified by
centrifugation in 1.5-ml microtubes, the bacteriophage
particles were precipitated with PEG-8000, and the single-
stranded, circular bacteriophage DNA was isolated from the
virions by phenol extraction. The recombinant, circular,
single-stranded bacteriophage DNA was sequenced by the
dideoxynucleotide termination method. Sequencing kits can
be purchased from various commercial vendors. Radioactive
32P-dCTP or 35S-dCTP was incorporated into the strands
synthesized in the sequencing reactions. Sequencing was
accomplished with many DEN-2 virus-specific primers
designed to sequence the entire genome. The sequence
reactions were electrophoresed in 6% (w/v) polyacrylamide
gels, which were dried onto filter paper and overlaid with
X-ray film. The DNA bands of the autoradiographs were
read by the investigator, and the data was entered into a
sequence project data spreadsheet. This sequencing method
has been used extensively in the past (e.g., Kinney et
al., 1986; Johnson et al., 1986; Deubel et al., 1986;
Deubel et al., 1988; Kinney et al., 1989; Trent et al.,
1987).

Nucleotide sequencing was also performed by the
current method of direct sequencing of double-stranded
plasmid DNA by the dideoxynucleotide termination method
using the Applied Biosystems Taq DyeDeoxy Terminator Cycle
Sequencing Kit, cycle sequencing in the Model 9600
thermocycler according to the instruction manual supplied
with the kit, and analyzing the DNA sequence on an ABI
Model 373A DNA sequencing apparatus. Sequencing reactions
in 200-µl thin-walled microtubes contained 9.5 µl of
reaction mix (buffer, the four dideoxynucleotides, and Taq
polymerase supplied in the kit), 7.0 µl of double- or
single-stranded template DNA (150 pg/bp), and 3.2 µl of
10 µM sequencing primer (32 pmol). After mixing, the
reactions were placed in a Perkin-Elmer Model 9600
thermocycler, and programmed cycle sequencing was
performed for 25 cycles of incubation at 96°C for 15 sec,
50°C for 15 sec, and 60°C for 4 min. Strand extension was
performed at 60°C rather than 72°C because the fluorescent
dye-labeled dideoxynucleotide terminators are heat
sensitive. The reaction was then applied to a Centrisep
gel column (Princeton Separations) to remove
unincorporated dye-labeled dideoxynucleotides according to
the instructions supplied with the columns. The eluted
DNA was vacuum dried for 1 h using a Savant Speed Vac
Concentrator and stored at -70°C. The DNA was hydrated
with 5 µl of deionized formamide and 1 µl of 50 mM
disodium EDTA, then heated in an aluminum block for 2 min
at 90°C. A 3-µl aliquot of the denatured DNA sample was
applied to one of 24 wells of a polyacrylamide-urea gel in
an Applied Biosystems 373A DNA sequencer. The color-coded
sequence chromatogram was read by visual inspection, and
the resulting nucleotide sequence was entered into a
computer-maintained sequence data spreadsheet. The
sequencing kit incorporates dideoxynucleotide terminators
that are each labeled with a unique fluorescent dye that
permits laser detection of all four terminators in a
single polyacrylamide gel lane in the Model 373 sequencer.
The data was recorded in the form of colored chromatograms
that are easily read by the investigator. Single-stranded
recombinant M13 DNA can also be sequenced in this manner.
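
Two figures in the cycle sequencing setup can be checked with simple arithmetic: the primer amount per reaction and the total programmed hold time. The sketch below is illustrative (function names are not from any kit), and the hold time ignores temperature ramping between steps.

```python
def primer_pmol(volume_ul: float, concentration_um: float) -> float:
    """Amount of primer in pmol; 1 micromolar equals 1 pmol per microliter."""
    return volume_ul * concentration_um


def programmed_hold_minutes(cycles: int, step_seconds: list) -> float:
    """Total hold time in minutes, excluding ramp time between steps."""
    return cycles * sum(step_seconds) / 60.0


if __name__ == "__main__":
    print(primer_pmol(3.2, 10))                        # 3.2 microliters of 10 micromolar primer
    print(programmed_hold_minutes(25, [15, 15, 240]))  # 96/50/60 degree steps
```

The primer figure reproduces the 32 pmol stated in the text, and the 25-cycle program amounts to 112.5 min of hold time before ramping is counted.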

Extraction of M13 Single-Stranded DNA for Sequencing:

White bacteriophage plaques containing recombinant
M13 DNA were picked with sterile toothpicks and placed
into 2-ml slightly turbid (absorbance less than 0.15)
cultures of E. coli JM101. The cultures were shaken at
300 rpm and 37°C overnight and then clarified by
centrifugation in microtubes at maximum speed for 10 min
at room temperature. One ml of the supernatant was
transferred to a fresh 1.5-ml microtube containing 200 µl
of sterile 20% (w/v) PEG-8000 in 250 mM NaCl. The tubes
were mixed by inversion, incubated for 15 min at room
temperature, and centrifuged at maximum speed for 5 min at
room temperature. The PEG supernatant was removed completely,
and the DNA pellet was resuspended in 300 µl of TE buffer.
An equal volume of pH 8-buffered phenol was added, and the
solution was mixed vigorously several times during a
period of 20 min at room temperature. The tube was
centrifuged for 5 min at room temperature, and the top
aqueous phase was transferred to a fresh 1.5-ml microtube.
After sequential extraction with phenol:chloroform:isoamyl
alcohol and chloroform, the DNA was precipitated by adding
2.5 volumes of 95% ethanol containing 300 mM ammonium
acetate and 10 mM MgCl2 and incubating at -20°C overnight.
The tube was centrifuged at maximum speed for 15 min at
4°C, and the supernatant was decanted. Following a rinse
with 500 µl of 75% ethanol, the DNA was air dried briefly,
resuspended in 60 µl of TE buffer, and stored at 4°C.



Primers:

Primer design was based on the sequence of DEN-2
virus, strain 16681, published by Blok et al. (1992), and
DEN-2 virus, Jamaican strain 1409, as reported by Deubel
et al. (1986) and Deubel et al. (1988).

Primers were synthesized by the Biotechnology Core
Facility at the CDC in Atlanta, Georgia. We received the
dried primers via mail and adjusted them to a
concentration of 100 µM in dH2O. The designations and
sequences of all of the primer-amplimers used in this
project are listed in Appendix A.

To amplify the 3' end of the DEN-2 virus genome, a
downstream amplimer was designed that was complementary to
the published sequence of the 3' terminus of the genome.
A unique XbaI restriction enzyme site was incorporated at
the 5' end of this amplimer to permit linearization of
the recombinant plasmid containing
the full-length cDNA clone at the 3' terminus of the
cloned genomic cDNA. This linearization was necessary to
obtain appropriately terminated DEN virus-specific run-off
RNA transcripts from the cDNA clone in transcription
reactions with bacteriophage T7 RNA polymerase.
Linearization at this 3'-terminal XbaI site resulted in
the incorporation of a 5-nucleotide TCTAG extension at the
3' terminus of the genomic mRNA transcribed from the full-
length cDNA clone of DEN-2 16681 virus, and a 4-nucleotide
CTAG extension at the 3' terminus of RNA transcribed from
the DEN-2 PDK-53 cDNA clone. The difference between the
two cDNA clones in the length of the extraneous
3'-terminal extension was due to the differently designed
3'-terminal amplimers used to obtain the 3'-end genomic
cDNA amplicon. Amplimer CD2-10687.XBA or CD2-10687.X2 was
used to amplify and clone the 3'-terminal portion of DEN-2
16681 or PDK-53 virus, respectively.
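
The 5- and 4-nucleotide extensions can be illustrated with a small model of run-off transcription. XbaI cuts T^CTAGA and leaves a 5'-CTAG overhang on the template strand, so a run-off transcript continues through the T and the CTAG of the site. The sketch below rests on two labeled assumptions: the genome sequences are hypothetical toy strings, and the shorter extension is modeled as the XbaI site sharing its first T with the last viral nucleotide, which is one design that would produce a CTAG extension but is not stated in the text.

```python
XBAI_SITE = "TCTAGA"  # XbaI cuts T^CTAGA, leaving a 5'-CTAG overhang


def runoff_extension(genome_cdna: str, downstream: str) -> str:
    """Return the non-viral nucleotides at the 3' end of a run-off transcript.

    genome_cdna: mRNA-sense cDNA ending at the last viral nucleotide.
    downstream:  sequence that follows it on the same strand.
    """
    insert = (genome_cdna + downstream).upper()
    genome_len = len(genome_cdna)
    # Start at the last viral nucleotide so that a site overlapping the
    # genome terminus is found, while internal sites are ignored.
    site_start = insert.find(XBAI_SITE, max(genome_len - 1, 0))
    if site_start == -1:
        raise ValueError("no XbaI site at or after the genome terminus")
    # The cleaved template strand retains the 4-nt overhang, so the
    # transcript runs through T + CTAG, i.e. 5 nt into the site.
    transcript_end = site_start + 5
    return insert[genome_len:transcript_end]


if __name__ == "__main__":
    toy_genome_end = "AACAGGTTCT"  # hypothetical 3'-terminal cDNA sequence
    print(runoff_extension(toy_genome_end, "TCTAGAGG"))  # site follows genome
    print(runoff_extension(toy_genome_end, "CTAGAGG"))   # site overlaps last T
```

In this model the first layout yields TCTAG and the second yields CTAG, matching the two extension lengths reported for the 16681 and PDK-53 clones.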
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The promoter for the bacteriophage T7 RNA polymerase 
was engineered at the 5' terminus of the cloned genomic
cDNA by incorporating the recognition sequence of the T7
RNA polymerase into the sequence of the 5'-terminal
upstream, mRNA-sense amplimer D2-SMT71 immediately
preceding the 5'-terminal nucleotide of the DEN-2 viral
genome. This design ensured that the T7 RNA polymerase
initiated RNA transcription at the 5'-terminal nucleotide
of the DEN-2 virus-specific cDNA (Milligan et al., 1987).

Amplimers for PCR reactions were designed to take
advantage of RENZ sites identified within the nucleotide
sequence of the genome of DEN-2 16681 virus. cDNA
molecules were amplified to permit ligation or splicing
together of overlapping contiguous cDNA clones at shared,
overlapping, unique RENZ sites (Figure 3).

Transcription of Genomic mRNA from DEN Virus-Specific
Full-Length cDNA Clones:

The recombinant plasmid containing the full-length
cDNA clone was prepared for RNA transcription by
linearization at the unique XbaI site located at the 3'
terminus of the cloned genomic cDNA. The restriction
reaction containing the XbaI-restricted plasmid was
extracted sequentially with phenol:chloroform:isoamyl
alcohol and chloroform and then precipitated. The DNA was
redissolved in 50 µl of TE buffer and digested with
proteinase K at a concentration of 1 mg/ml for 1 h at 37°C
to hydrolyze contaminating RNases. The sample was then
extracted twice with "For RNA Only"
phenol:chloroform:isoamyl alcohol buffered to pH 8,
extracted twice with chloroform to remove traces of
phenol, and precipitated by adding one-tenth volume of
RNase-free 3 M sodium acetate, pH 5.2, and 2.5 volumes of
ethanol and incubating for at least 1 h at -70°C or
overnight at -20°C.

DEN-2 virus-specific genomic RNA was transcribed from
the linearized cDNA template using a commercial T7
transcription kit (Ampliscribe T7 transcription kit,
Epicentre Technologies). Transcription reactions were
performed for 2 h at 37°C in RNase-free 1.5-ml microtubes
in 20-µl reactions containing 100-1000 ng of linearized
DNA template, 7.5 mM each of CTP, GTP, and UTP, 0.75 mM
ATP, 2.7 mM m7GpppA cap analog, 6.7 mM DTT, 2.0 µl of a
10X concentration of a proprietary buffer supplied with
the commercial kit, and 2.0 µl of the proprietary
Ampliscribe enzyme solution supplied with the kit.
Reaction solutions were used directly and without further
treatment to transfect BHK-21 cells.

Transfection of BHK-21 Cells with Genomic RNA Transcripts:

BHK-21 clone 15 cells were transfected with RNA
transcripts by electroporation (Liljestrom et al., 1991).
Fresh cultures of BHK-21 cells were grown to 90%
confluency, rinsed twice with cold RNase-free phosphate
buffered saline (PBS), and released from the plastic by
incubation with 3 ml of commercial trypsin-EDTA solution
(GIBCO-BRL). The cells were pelleted by low-speed
centrifugation at 1200 rpm for 5 min in a Beckman GPKR
centrifuge. The cells were washed twice with cold PBS,
resuspended in cold PBS and kept on ice. The cells were
counted using a hemacytometer and microscope, and the cell
concentration was adjusted to 10^7 cells/ml. One-half ml of
the washed, adjusted cells was mixed with each
transcription reaction solution in 1.5-ml microtubes on
ice. The mixture was transferred to a cold
electroporation cuvette with 0.2-cm electrode gap, which
was placed in the cuvette holder of the Bio-Rad Gene
Pulser. The cells were shocked twice using settings of
1.5 kV, 25 µFD of capacitance, and resistance set
to infinity. The shocked cells were incubated for 10 min
at room temperature and then added to 75-cm2 tissue culture
flasks containing 20 ml of MEM containing 10% FBS.
Transfected cell cultures were incubated at 37°C for 5-8
days until CPE was evident in the cell monolayer and/or
expression of DEN virus-specific antigens was identified
in an aliquot of cells scraped from the monolayer. Antigen
expression was detected in indirect immunofluorescence
tests using DEN virus-specific mouse hyperimmune ascitic
fluid or monoclonal antibodies.

RESULTS 

Replication of DEN-2 16681 Virus:

DEN-2 16681 virus replicates to high titer in cell
culture. The CDC virus seed used in this study contained
2.0 x 10^7 plaque forming units (PFU)/ml. This titer was
determined by plaque titration of the seed virus in
monolayer cultures of Vero cells. This seed titered 1.3 x
10^4 PFU/ml in LLC-MK2 cells. A growth curve for this virus
was determined in C6/36 Aedes albopictus cell culture
(Figure 4). This level of replication is quite high for
a flavivirus. The DEN-2 16681 virus is eminently suitable
to serve as the parent to an infectious cDNA clone of DEN
virus.

The DEN-2 PDK-53 vaccine virus, taken directly from a
vaccine vial obtained from Mahidol University, contained
3.4 x 10^4 PFU/ml of virus, as titrated in Vero cell
monolayers, and 1.5 x 10^4 PFU/ml as titrated in LLC-MK2
cell monolayers.

RT/PCR Amplification and Cloning of DEN-2 16681 cDNA:

The entire genome of DEN-2 virus, parental strain
16681, was amplified from genomic RNA in the form of 5
cDNA clones of various sizes (T7-F1, F2, F3, F4, and F5).
PCR amplification with 5 sets of upstream and downstream
amplimers yielded the predicted amplicon sizes in PCR
reactions. Figure 5 shows the migration of these cDNA
fragments in agarose gels.

Recombinant plasmids, obtained by ligating the cDNA
amplicons into the pCRII TA-vector, were extracted from
minicultures derived from transformed E. coli XL1-Blue
colonies. Uncut plasmids were screened for the presence
of cDNA insert by comparing their mobility in agarose gel
with the mobility of uncut wild-type pCRII vector plasmid.
Selected plasmids were then restricted with the
restriction enzyme EcoRI to confirm the size of the
inserted cDNA fragment. EcoRI digests of F2-Sal, Sal-F2,
and F3 plasmids derived from independent transformed
bacterial colonies are shown in Figure 6.

The following 15 DEN- 2 16681 virus-specific cDNA 
clones, shown schematically in Figure 7, were selected for 
nucleotide sequence analysis: 

[Table of the 15 DEN-2 16681 cDNA clones not recoverable
from the source. Legible entries: RT/PCR, AA6-2, F5, AA6-4.]

RT/PCR Amplification and Cloning of DEN-2 PDK-53 cDNA:

The entire genome of DEN-2 virus, vaccine strain PDK-
53, was amplified from genomic RNA in the form of 23 cDNA
clones of various sizes. Even though the PDK-53 vaccine
contained only about 10^4 PFU/ml of virus, we were able to
routinely amplify cDNA from RNA that was extracted
directly from this seed virus. To accomplish this, we
routinely used the "extended PCR method", incorporating the
Taq extender reagent (Stratagene) in the PCR reactions.
We had previously shown that the Taq extender
significantly enhanced yields of large molecular weight
amplicons in the PCR amplification of the nonstructural
genes of the flavivirus, St. Louis encephalitis virus
(Figure 8A). For extended PCR reactions, reaction
mixtures were made as for standard PCR reactions, but the
standard PCR buffer was replaced with the Taq extender
buffer, and 1 unit of AmpliTaq DNA polymerase (Perkin-
Elmer) and 1 unit of the Taq extender enzyme per kbp of
expected amplicon size were included in the reaction.
Figure 8B shows the correct agarose gel migration of large
cDNA amplicons F1 (containing the T7 RNA polymerase
promoter at the 5' end of the mRNA-sense strand of the
amplicon), F2, and F3 obtained by PCR amplification using
DEN-2 PDK-53 viral genomic RNA as template. The standard
PCR reaction also worked for a number of DEN-2 PDK-53
amplifications.

The PDK-53 PCR products were cloned into the pGEM-5Zf
TA-vector (Promega) or the pT7Blue(R) TA-vector (Novagen).
Although we seemed to have the best cloning efficiency of
PCR amplicons in the pCRII TA-vector, the other vector
kits were less expensive and worked well. The cloning
efficiency of PCR products into the TA-vector decreased
rapidly as amplicon size increased beyond 2000 bp.

The following 23 DEN-2 PDK-53 virus-specific cDNA
clones were selected for nucleotide sequence analysis:

Nucleotide Sequence Analyses of DEN-2 16681 cDNA Clones:

EcoRI fragments of the 15 DEN-2 16681 virus-specific
cDNA clones were subcloned into the single-stranded
bacteriophage M13mp18 or M13mp19 for sequencing.
Sequencing of the entire viral genome was performed
manually using radioisotopic labeling and exposure,
development, and reading of autoradiographs. The data was
read from the films and entered by hand into a sequence
data spreadsheet.

The locations of observed cDNA artifacts or "errors"
dictated the splicing strategy of subclones to construct
the full genome-length clone. If the nucleotide at a
particular position of one cDNA clone differed from the
nucleotides at that same position in 2 or more independent
clones, then the nucleotide in the first clone was deemed
to be an error. If only 2 cDNA clones were sequenced for
a given region of the genome and they differed in sequence
at a particular position, and one of the cDNA clones
agreed with the sequence data of Blok et al. (1992), then
the clone containing the nucleotide that was in agreement
with the latter investigators was deemed to be correct.
The approximate locations of the cDNA errors identified in
the 16681 clones are illustrated in Figure 9.
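
The error-calling rule above can be sketched as a small function. This is one interpretation under stated assumptions, not the procedure as actually performed (the sequences were compared by hand): with three or more clones a strict majority base is taken as correct, and with exactly two discordant clones the published reference base breaks the tie. The function name and return convention are illustrative.

```python
from collections import Counter


def call_position(clone_bases, reference_base=None):
    """Return (consensus_base, indices_of_clones_deemed_errors).

    clone_bases:    bases observed at one position in independent clones.
    reference_base: base in the published sequence, used only as a
                    tie-breaker when exactly two clones disagree.
    Returns (None, []) when the rule cannot resolve the position.
    """
    if not clone_bases:
        raise ValueError("at least one clone is required")
    counts = Counter(clone_bases)
    if len(counts) == 1:
        return clone_bases[0], []          # all clones agree
    if len(clone_bases) == 2:
        if reference_base not in counts:
            return None, []                # neither clone matches the reference
        consensus = reference_base
    else:
        ranked = counts.most_common()
        (top, top_n), (_, second_n) = ranked[0], ranked[1]
        if top_n < 2 or top_n == second_n:
            return None, []                # no base supported by a clear majority
        consensus = top
    errors = [i for i, base in enumerate(clone_bases) if base != consensus]
    return consensus, errors


if __name__ == "__main__":
    print(call_position(["A", "A", "G"]))               # third clone is the outlier
    print(call_position(["A", "G"], reference_base="G"))
    print(call_position(["A", "G"]))                    # unresolved without a reference
```

Positions returned as unresolved would need an additional independent clone or direct sequencing of the amplicon before a subclone is chosen for splicing.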

The full genome-length cDNA clone of DEN-2 16681
virus was first constructed in pUC19. Unfortunately, RNA
transcribed from this clone was not infectious. When
over 90% of the full-length cDNA in the clone was
resequenced, it was determined that several mutations had
occurred during splicing and cloning manipulations of the
subclones in E. coli. One of these mutations was a base
deletion in the NS4B gene. This deletion would cause a
frameshift of the amino acid sequence, resulting in
ribosomal translation of a nonsense polypeptide downstream
of the mutation point. This fatal deletion, by itself,
would explain the noninfectious nature of the RNA
transcribed from the first full-length clone in pUC19.
The final, correct cDNA subclones (F1-E, F2-E, F3/4/5-F)
that were incorporated into the full-length, successfully
infectious clone of 16681 virus were reanalyzed by direct
sequencing of the double-stranded plasmid DNA via the
thermocycling method using the Taq DyeDeoxy Terminator
Cycle Sequencing Kit. Sequence analysis was performed
using the automated 373A DNA sequencing machine. The
color-coded sequence chromatograms were read by the
investigator and the data was entered manually into a
computer-based spreadsheet.

We independently confirmed the sequence of the 5'-
terminal 32 nucleotides of the DEN-2 16681 viral genome.
A 5'-end RNA-cDNA hybrid molecule, made with primer cD2-
996 and reverse transcriptase, was 3'-tailed with dCTP and
annealed to dGTP-tailed, PstI-cut M13mp19 RF DNA. One of
the resulting M13 clones had a cDNA run-off product
containing the 5'-terminal end of the genome. The 5'-end
sequence was identical to that published for DEN-2 1409
(Deubel et al., 1988) and DEN-2 16681 (Blok et al., 1992).
We have not independently confirmed the sequence of the
3'-terminal 36 nucleotides of DEN-2 16681 virus or the 5'-
or 3'-terminal nucleotides of DEN-2 PDK-53 virus.

We sequenced uncloned, PCR-derived amplicon cDNA
fragments directly for the following regions of the DEN-2
16681 viral genome: nucleotides 70-260, 330-870, 890-
1690, 1890-3720, 3770-4050, 4080-4320, and the 3'-terminal
9990-10686. Unlike the sequencing of cloned DNA, direct
analysis of PCR amplicons provides sequence information
for the majority population of amplified cDNA molecules,
and therefore for the majority population of template RNA
molecules.
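
The extent of this direct amplicon sequencing can be tallied from the listed ranges. The sketch below assumes the ranges are inclusive and non-overlapping, and takes the genome length as 10723 nucleotides, the last position cited elsewhere in this section.

```python
# Directly sequenced regions (inclusive nucleotide positions) from the text
DIRECT_REGIONS = [
    (70, 260), (330, 870), (890, 1690), (1890, 3720),
    (3770, 4050), (4080, 4320), (9990, 10686),
]
GENOME_LENGTH = 10723  # assumed from the 3'-terminal position cited in the text


def covered_nucleotides(regions) -> int:
    """Total nucleotides covered by inclusive, non-overlapping ranges."""
    return sum(end - start + 1 for start, end in regions)


if __name__ == "__main__":
    total = covered_nucleotides(DIRECT_REGIONS)
    print(total, round(100.0 * total / GENOME_LENGTH, 1))
```

Under these assumptions, direct sequencing covered 4583 nucleotides, or roughly 43% of the genome.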

We observed very early in the project that the
nucleotide sequence of DEN-2 16681 virus that we
determined at the CDC laboratory differed significantly
from the sequence of DEN-2 16681 virus as published by
Blok et al. (1992). Our nucleotide sequence differed from
that published by Blok et al. (1992) at 60 nucleotide
positions, which were located throughout the genome.
Amino acid substitutions were encoded by 26 of these
nucleotide differences. The approximate genomic locations
of the nucleotide differences are illustrated in the
schematic diagram in Figure 10. The exact nucleotide
positions of the discrepancies are shown in Figure 11.

Nucleotide Sequence Analyses of DEN-2 PDK-53 cDNA Clones:

The DEN-2 PDK-53 virus-specific cDNA clones were
analyzed by direct sequencing of the double-stranded
plasmid DNA by the thermocycling method using the Taq
DyeDeoxy Terminator Cycle Sequencing Kit. The 3'-end
sequence from nucleotide position 10290-10686 was also
determined by direct sequencing of PCR-derived amplicon
cDNA. Sequence analysis was performed using the automated
373A DNA sequencing machine. The color-coded sequence
chromatograms were read by the investigator and the data
was entered manually into a computer-based spreadsheet.
The approximate locations of the cDNA errors identified in
the PDK-53 cDNA clones are illustrated in Figure 12.

Our determination of the nucleotide sequence of DEN-2
PDK-53 virus differed significantly from the PDK-53
genomic sequence published by Blok et al. (1992). The
latter investigators reported a total of 53 nucleotide
differences that encoded 27 amino acid mutations between
the nucleotide sequences of the genome of DEN-2 16681
virus and that of its vaccine derivative, PDK-53 virus.
They reported the following nonsilent mutations: 1 in the
capsid, 2 in prM, 1 in M, 3 in E, 3 in NS1, 3 in NS2A, 2
in NS2B, 3 in NS3, 3 in NS4A, 3 in NS4B, and 3 in NS5. We
detected only 8 nucleotide mutations between the genomes
of these two virus strains. One mutation occurred in the
5'-NC region of the genome, while 7 nucleotide mutations,
4 of which encoded amino acid substitutions, occurred in
the coding region of the genome as shown in Figure 13 and
the following table.
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Table: Summary of nucleotide differences between the 
genomes of DEN-2 16681 virus and its vaccine
derivative virus, strain PDK-53.

Genome     Genome       Nucleotide         Amino Acid
Position   Region       16681    PDK-53    16681    PDK-53

57 a       5'-NC        C        T         -        -
524 a      prM-29       A        T         Asp      Val
2055 a     E-373        C        T         Phe      Phe
2579 a     NS1-53       G        A         Gly      Asp
4018       NS2A-151     C        T         Leu      Phe
5547       NS3-342      T        C         Arg      Arg
6599 a     NS4A-75      G        C         Gly      Ala
8571 a     NS5-334      C        T         Val      Val

a 16681 vs. PDK-53 difference agrees with Blok et al.
(1992).
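
The table can be encoded as data to confirm the counts given in the text (8 nucleotide differences, 7 in the coding region, 4 encoding amino acid substitutions). The tuple layout and function name below are illustrative; the values are transcribed from the table.

```python
# (genome position, region, nt in 16681, nt in PDK-53, aa in 16681, aa in PDK-53)
# Amino acids are None for the 5' noncoding region.
MUTATIONS = [
    (57,   "5'-NC",    "C", "T", None,  None),
    (524,  "prM-29",   "A", "T", "Asp", "Val"),
    (2055, "E-373",    "C", "T", "Phe", "Phe"),
    (2579, "NS1-53",   "G", "A", "Gly", "Asp"),
    (4018, "NS2A-151", "C", "T", "Leu", "Phe"),
    (5547, "NS3-342",  "T", "C", "Arg", "Arg"),
    (6599, "NS4A-75",  "G", "C", "Gly", "Ala"),
    (8571, "NS5-334",  "C", "T", "Val", "Val"),
]


def classify(mutation) -> str:
    """Label a table row as noncoding, silent, or an amino acid substitution."""
    _, _, _, _, aa_parent, aa_vaccine = mutation
    if aa_parent is None:
        return "noncoding"
    return "silent" if aa_parent == aa_vaccine else "substitution"


if __name__ == "__main__":
    for row in MUTATIONS:
        print(row[0], row[1], classify(row))
```

Classifying the rows gives one noncoding change, three silent changes (E-373, NS3-342, NS5-334), and four substitutions (prM-29, NS1-53, NS2A-151, NS4A-75).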



The few nucleotide positions where our data and those
of Blok et al. (1992) agreed, in terms of sequence
differences between the 16681 and PDK-53 viral genomes,
were distributed throughout the genome. The entire genome
of DEN-2 16681 virus was cloned and sequenced before we
received the PDK-53 vaccine virus at our laboratory.
Except for the 3'-terminal cDNA clones #17-#19, every PDK-
53 virus-specific cDNA clone constructed in our laboratory
contained at least one nucleotide position of 16681/PDK-53
sequence difference confirmed by both ourselves and Blok
et al. (1992). Therefore, our PDK-53 virus-specific cDNA
clones did not result from contamination of PDK-53-
specific PCR reactions with 16681 virus-specific cDNA
template. Our PDK-53 virus-specific cDNA clones, which
also contained the many sequence discrepancies between our
data and those of Blok et al. (1992), encoded the
nucleotide sequence from the 5' terminus to nucleotide
position 10337 of the genome of PDK-53 virus. The 3'-
terminal 387 nucleotides (10337-10723) of DEN-2 PDK-53
virus were identical to those of the parental 16681 virus.
Since none of the PDK-53 virus-specific cDNA clones
covering this region of the genome contained a point of
confirmed 16681/PDK-53 sequence difference, we repeated
the PCR amplification of the 3' terminus of the PDK-53
virus genome. This was done to ensure that the 3'-
terminal cDNA clones #17-#19 did not result from PCR
reactions contaminated by 16681 virus-specific DNA
template. The PCR reaction components were pipetted in a
room in which DEN cloning had not been performed
previously, using new micropipetors, newly opened pipet
tips with aerosol barrier, and freshly made stock
reagents. Direct sequencing of the resulting double-
stranded PCR cDNA amplicon confirmed that the 3'-terminal
387 nucleotides of DEN-2 PDK-53 virus were indeed identical
to the 3' terminus of the 16681 parent.
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The finalized nucleotide sequence of DEN- 2 virus, 
strain 16681, including the nucleotide and amino acid 
mutations identified for DEN -2 PDK-53 virus, is shown in 
Figure 14. 



Construction of DEN-16681 Full-Length Clone in pUC19:

For the construction of the full genome-length cDNA
clone of DEN-2 16681 virus, 5 of the sequence-characterized
PCR-amplified cDNA subclones were selected for splicing.
However, clone #5 contained a cDNA "error" that was not
readily spliced out with the existing clones. This error,
which was a C-to-T mutation at nucleotide position 1730
and encoded a nonsilent Thr-to-Ile amino acid substitution
at E-265, was incorporated into the F2 construct. The
intermediate F2 construct was the result of splicing the
F2-Sal clone (#5) SphI/HpaI fragment to the Sal-F2 clone
(#7) HpaI/KpnI fragment in the MCS of plasmid pUC18
(Figure 15). To correct the error, a new PCR amplicon was
made using primers D2-1261 and CD2-2955. Resulting clones
in the TA-vector were sequenced, and the correct SphI/HpaI
fragment of a new clone was substituted for the faulty
SphI/HpaI fragment of the original F2 construct (Figure
16). The corrected F2 clone was designated F2-C.

The relevant cDNA clones of DEN-2 16681 virus were
spliced together via a series of intermediate ligation
products in the MCS of pUC18 to yield F1/3/4/5, which
contained all of the genome except for the SphI-KpnI
1380-4493 region present in clone F2-C. Multiple attempts
to ligate the F2-C SphI/KpnI cDNA fragment into F1/3/4/5
in pUC18 failed. The cDNA insert of F1/3/4/5-pUC18 was
then transferred to the MCS of pUC19, resulting in
F1/3/4/5-pUC19. This operation simply reversed the
orientation of the cDNA insert within the context of the
pUC plasmid. Ligation of SphI/KpnI-cut F1/3/4/5-pUC19 and
F2-C SphI/KpnI insert readily yielded transformants in E.
coli XL1-Blue that contained the full-length cDNA clone
F1/2/3/4/5-pUC19, which was designated pD2/IC-20. The
detailed splicing procedures for pD2/IC-20 are illustrated
in Figure 17. The orientation-specific cloning of the
full genome-length cDNA in pUC19 rather than pUC18 is
diagrammed in Figure 18.

The full genome-length cDNA of DEN-2 16681 virus was
cloned into the MCS of pUC19. Apparent full genome-length
viral mRNA was transcribed from linearized pD2/IC-20.
This transcribed product failed to yield infectious virus
following electroporation of BHK-21 cells. Most of the
cDNA in the pD2/IC-20 clone was resequenced, and several
cloning artifacts, including a fatal single-nucleotide
deletion, were identified. Original subunit intermediate
cDNA constructs in pUC18 were resequenced to confirm that
they possessed the correct sequence and corrected where
necessary. The corrected primary cDNA clones F1, F2-C,
and F3/4/5 were then ligated into the low-copy plasmid
pBR322, rather than the high copy-number pUC18 plasmid.
It was envisioned that the cDNA would be more stable in a
slower-replicating plasmid in E. coli.

To enable more straightforward cloning into pBR322,
the MCS of pUC19 was spliced into the pBR322 plasmid
(Figure 19). This resulted in plasmids pBRUC-138 and
pBRUC-139 containing the pUC MCS in both orientations
within the pBR322 plasmid backbone. The SphI site was
removed from both pBRUC plasmids by cutting with SphI,
blunt ending of the cut ends using T4 DNA polymerase, and
then ligating the ends back together. This was necessary
for the construction of the full-length cDNA clone because
SphI is one of the cDNA restriction/splicing sites for the
clone.
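
The site-removal step described above can be sketched on a single (top) DNA strand. SphI recognizes GCATGC and cuts after the second G (GCATG^C), leaving a 4-nucleotide 3' overhang; the 3'-to-5' exonuclease activity of T4 DNA polymerase trims such overhangs, so blunt religation deletes four base pairs and destroys the site. The following Python sketch models only that outcome; the function name and the example fragment are hypothetical illustrations, not sequences taken from the plasmids described here.

```python
SPHI_SITE = "GCATGC"  # SphI cuts GCATG^C, leaving a 3' overhang of CATG


def destroy_sphi_site(top_strand: str) -> str:
    """Model cut, 3'-overhang removal, and blunt religation.

    Only the top strand is tracked. After trimming, the left duplex
    ends at the first G of the site and the right duplex begins at
    the final C, so the four bases CATG are lost.
    """
    i = top_strand.find(SPHI_SITE)
    if i < 0:
        raise ValueError("no SphI site found")
    if top_strand.find(SPHI_SITE, i + 1) >= 0:
        raise ValueError("more than one SphI site; sketch handles one")
    left = top_strand[: i + 1]    # ...G
    right = top_strand[i + 5:]    # C...
    return left + right


if __name__ == "__main__":
    fragment = "AAGCTTGCATGCCTGCAGG"  # hypothetical flanking sequence
    repaired = destroy_sphi_site(fragment)
    print(repaired, SPHI_SITE in repaired)
```

In this model the plasmid is shortened by four base pairs and no longer contains GCATGC, which leaves the SphI site within the viral cDNA as the only one available for splicing.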

The F3/4/5-F cDNA clone of DEN-2 16681 virus, which
had been verified by sequence analysis, was cloned into
pBRUC-139 (SphI-) (Figure 20). Following this ligation,
the F1-E and F2-C cDNA clone fragments were also moved
into the pBR322 backbone to construct the full
genome-length cDNA clone, pD2/IC-30P (Figure 20). This
recombinant plasmid was replicated successfully in both
TB-1 and MC-1061 strains of E. coli.

Construction of DEN-2 PDK-53 Infectious cDNA Clone:

The full-length infectious clone of DEN-2 16681 virus
was used in the construction of the infectious clone for
PDK-53 virus. Since the 3'-noncoding regions of the
genomes of both viruses are identical, and the amino acid
sequences of the translated precursor polyproteins encoded
by genome nucleotide positions 6646-10269 are identical in
both viruses, the infectious clone of PDK-53 virus was
constructed using the 16681 3'-end cDNA from the NheI site
at nucleotide position 6646 to the 3' terminus of the
genome (Figure 21). After correcting a cDNA error in the
PDK-53 F3-3C subunit clone, this fragment and the F2-16B
cDNA fragment were ligated into the infectious clone
backbone to construct the DEN-2 PDK-53 virus-specific
full-length cDNA clone, pD2/IC-130V (Figure 21).

5 

Transcription of Viral mRNA from DEN-2 Infectious cDNA
Clones:

Viral genomic RNA extracted from gradient-purified
virions was analyzed by nondenaturing RNA agarose gel
electrophoresis to observe the level of RNA degradation
and the limits of detectability by ethidium bromide
staining. Figure 22 shows an agarose gel electropherogram
for 22-383 ng of viral genomic RNA obtained from purified
preparations of wild-type DEN-2 16681 virus and wild-type
Venezuelan equine encephalitis (VEE) virus, strain
Trinidad donkey. Although degradation of the RNA is
visible as a spectrum of smaller molecular weight nucleic
acid (smear in Figure 22), definite full-genome length RNA
bands are clearly visible. This smear of nucleic acid is
probably also due, in part, to multiple conformations of
the single-stranded RNA molecules which migrate through
the gel at different rates. The relative gel migration of
the single-stranded RNA does not correlate directly with
the sizes of the double-stranded molecular weight marker
DNA bands (MW, Figure 22); the VEE and DEN-2 viral genomes
are 11,447 and 10,723 nucleotides in length, respectively.
BHK-21 and C6/36 cells were transfected successfully by
electroporation with 2000, 500, 100, 10, 1, and 0.1 ng of
viral genomic RNA extracted from purified VEE or DEN-2
16681 virus, as indicated by development of CPE,
expression of viral proteins detected by indirect
immunofluorescence tests using virus-specific antibody,
and/or by plaque titration of infectious virus from the
transfected-cell culture medium. RNA quantities of 1 ng
or less were essentially undetectable in the ethidium
bromide-stained agarose gel system we used. Therefore,
authentic RNA transcripts derived from full genome-length
cDNA and visualized in agarose gel electropherograms of
transcription reactions should be infectious for BHK-21
cells by electroporation.

Investigators previously constructed an infectious
cDNA clone for VEE virus as reported by Kinney et al.
(1989). RNA transcription reaction conditions that
yielded high quantity and quality of infectious mRNA
transcripts from the pVE/IC-92 infectious clone of VEE
virus failed in multiple attempts to transcribe RNA from
the pD2/IC-20 clone of DEN-2 16681 virus. Figure 23 shows
an agarose gel electropherogram that demonstrates
successful transcription of RNA from the VEE clone, but
not pD2/IC-20.

In an attempt to improve RNA transcription from the
DEN-2 clone, commercial transcription kits were purchased.
The Megascript transcription kit supplied by Ambion also
failed to transcribe RNA from the DEN clone. However, the
Ampliscribe kit obtained from Epicentre Technologies
enabled efficient transcription of RNA from the DEN-2
clone (Figure 24).

The success of the Ampliscribe kit apparently was due
to the high concentration of ribonucleotides and a very
high, but proprietary, concentration of T7 RNA polymerase.
The RNA transcribed from pD2/IC-20 was not infectious.
However, viral mRNA transcribed from DEN-2 16681 clone
pD2/IC-30P and PDK-53 clone pD2/IC-130V was infectious
(Figure 25).

Viral mRNA transcripts from both replicates of
pD2/IC-30P (A and D) and pD2/IC-130V (F and J) were
infectious, producing viable infectious virus in
electroporated BHK-21 cells. Figure 26 shows RNA
transcripts from pD2/IC-20, pD2/IC-30P, and pD2/IC-130V.

Construction of DEN-2 16681/PDK-53 Chimeric cDNA Clones:

Several chimeric full-length cDNA clones were derived
from the pD2/IC-30P and pD2/IC-130V clones. All clones
were constructed in the pBRUC-139 derivative of the pBR322
plasmid vector. E. coli strains XL1-Blue, MC-1061, and
TB-1 were successfully transformed with ligated
recombinant plasmids containing full genome-length cDNA.
Viable virus was derived from all of the indicated clones.
The evolutionary tree for the chimeric viruses is
diagrammed in Figure 27.

Details concerning the splicing strategies for the
chimeric clones are shown in Figure 28. Appropriate cDNA
fragments were cut and ligated together at the internal
SalI, SphI, KpnI, and NheI sites as well as at the 5'-SstI
and 3'-XbaI sites.

Viable prototype and chimeric viruses were derived
from each of the clones indicated in Figure 28 by
electroporation of BHK-21 cells with viral genome-length
mRNA transcribed from linearized plasmids. Seed stocks of
these viruses were prepared by centrifuge-clarification of
the cell culture medium, adjustment of the FBS
concentration to 10%, and freezing of seed aliquots at
-70°C. Virus concentrations were determined by plaque
titration of the virus seeds in monolayer cultures of Vero
cells. The results of these virus titrations are shown in
the following table.
Table. Plaque titration of DEN-2 16681 and PDK-53
       stock seed viruses and chimeric viruses
       recovered from BHK-21 cells transfected
       with infectious clone-derived viral mRNA
       transcripts.

   Virus             Titer (PFU/ml)    Genotype a

   DEN-2 16681       8.0 x 10^7        c D F G L R G V
   DEN-2 PDK-53      5.1 x 10^3        t V . D F . A .
   D2/IC-30P-A       3.6 x 10^5        . . . . . . . .
   D2/IC-30P-A2      1.7 x 10^5        . . . . . . . .
   D2/IC-130V-F      4.0 x 10^5        t V . D F . A .
   D2/IC-130V-J      2.2 x 10^5        t V . D F . A .
   D2/IC-130V2-1     2.8 x 10^5        t V . . . . A .
   D2/IC-130V2-7     8.8 x 10^4        t V . . . . A .
   D2/IC-31-12       2.1 x 10^5        t V . . . . . .
   D2/IC-31-15       3.2 x 10^5        t V . . . . . .
   D2/IC-32-A        1.4 x 10^6        . . . D F . . .
   D2/IC-32-G        1.2 x 10^6        . . . D F . . .
   D2/IC-33-C        9.6 x 10^4        . . . . . . A .
   D2/IC-33-P        1.9 x 10^5        . . . . . . A .
   D2/IC-321-L       1.1 x 10^6        t V . D F . . .
   D2/IC-321-N       7.6 x 10^5        t V . D F . . .
   D2/IC-323-B       7.2 x 10^5        . . . D F . A .
   D2/IC-323-I       8.8 x 10^5        . . . D F . A .
   D2/IC-31-57-5     2.4 x 10^5        t . . . . . . .
   D2/IC-31-524-D    3.2 x 10^4        c V . . . . . .

   a Genotype is designated in small case for the
     virus-specific 5'-noncoding nucleotide and in
     upper case single-letter amino acid abbreviation
     for amino acids encoded by virus-specific
     nucleotide mutations. Dots represent nucleotide
     or amino acid sequence identity with DEN-2 16681
     virus.
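
The dot notation in the genotype column can be expanded mechanically against the DEN-2 16681 reference row. The sketch below is illustrative only: the function names are hypothetical, and the assignment of the eight genotype columns to loci assumes they follow the same order as the earlier table of 16681/PDK-53 nucleotide differences.

```python
REFERENCE_16681 = "c D F G L R G V".split()  # DEN-2 16681 row of the table

# Assumed column order, taken from the nucleotide-difference table.
LOCI = ["5'-NC", "prM-29", "E-373", "NS1-53",
        "NS2A-151", "NS3-342", "NS4A-75", "NS5-334"]


def expand(genotype: str) -> str:
    """Replace each dot with the corresponding 16681 character."""
    tokens = genotype.split()
    if len(tokens) != len(REFERENCE_16681):
        raise ValueError("expected one token per genotype column")
    return "".join(ref if tok == "." else tok
                   for tok, ref in zip(tokens, REFERENCE_16681))


def differing_loci(genotype: str) -> list:
    """List loci where the genotype departs from DEN-2 16681."""
    tokens = genotype.split()
    return [locus
            for locus, tok, ref in zip(LOCI, tokens, REFERENCE_16681)
            if tok not in (".", ref)]


if __name__ == "__main__":
    pdk53 = "t V . D F . A ."  # DEN-2 PDK-53 row of the table
    print(expand(pdk53))
    print(differing_loci(pdk53))
```

Applied to the PDK-53 row, this lists the 5'-noncoding position and the four nonsilent loci, while the three silent positions remain identical to the parent at the amino acid level.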
To establish the validity of the clone-derived
chimeric viruses, relevant genomic cDNA fragments were
amplified directly from seed viruses by PCR and
spot-sequenced. The results are shown in Figure 29. This
validation process is ongoing. Except for D2/IC-31-524
virus, appropriate cDNA insert regions in chimeric viruses
have been confirmed by sequence analysis. Except for
D2/IC-30P, D2/IC-130V, and D2/IC-31-57, which have been
fully confirmed, clone-derived chimeric viruses have yet
to be spot-sequenced in a recipient clone-derived cDNA
region to definitely establish the chimeric nature of the
virus. The recipient clone is the recombinant plasmid
backbone into which a cDNA fragment, the insert fragment,
from a heterologous donor clone is spliced. Where
duplicate clone-derived viruses were obtained, both
viruses of a given genotype were spot-sequenced, and both
gave the same result, which is shown in Figure 29.

Submission of pD2/IC-30P and pD2/IC-130V to ATCC:

Patent deposits of the full genome-length cDNA clones
of DEN-2 16681 and PDK-53 viruses were submitted to the
American Type Culture Collection (ATCC), Rockville,
Maryland, U.S.A. Both pD2/IC-30P-A and pD2/IC-130V-F were
grown overnight in E. coli TB-1 cells. Six cryogenic
vials containing 1 ml each of frozen cell culture in
10% glycerol were submitted by dry ice shipment. Prior to
shipment, plasmid was extracted from a 1 ml aliquot of
each virus-specific culture. The recombinant full-length
plasmid was recovered from the cells as shown in Figure
30.

The pD2/IC-30P-A deposit with the ATCC was assigned
accession number ATCC 69826, and the pD2/IC-130V-F deposit
with the ATCC was assigned accession number ATCC 69825.
Date of deposit was May 25, 1995.
Construction of Chimeric DEN-2/1, -2/3, and -2/4
Infectious cDNA Clones:

We contemplate deriving chimeric DEN-2/1, DEN-2/3,
and DEN-2/4 viruses from recombinant full genome-length
cDNA clones containing the genetic background of DEN-2
PDK-53 virus and the prM and E genes of the DEN-1, DEN-3,
and DEN-4 candidate vaccine viruses, respectively. To
accomplish this, the prM and E genes of the vaccine
viruses were amplified by PCR. Because our laboratory has
been establishing a sequence database to analyze the
molecular epidemiology of several flaviviruses, including
all of the serotypes of dengue virus, the primers used for
cDNA amplification in the PCR were readily available at
our laboratory. The amplified cDNA molecules were
sequenced directly, thus providing the sequence of the
population of virions in the virus seed. The amplified
cDNA amplicons for the DEN-1, DEN-3, and DEN-4 vaccine
viruses have all been cloned into the pGEM-5Zf TA-vector.
The cloned cDNA has not been analyzed by sequencing, since
it will be necessary to rederive the cDNA amplicons by PCR
to incorporate appropriate RENZ cleavage sites within the
amplicon for splicing into the full-length cDNA backbone
of DEN-2 PDK-53 virus. The partial nucleotide sequences
of the genomes of the DEN-1, DEN-3, and DEN-4 vaccine
viruses were aligned with the DEN-2 PDK-53 sequence. All
four sequences are aligned with the nucleotide sequence of
DEN-2 16681 virus and its deduced amino acid sequence in
Figure 31. The deduced amino acid sequences of the DEN
viruses are aligned in Figure 32.

It is readily evident from the aligned nucleotide
sequence data that useful restriction enzyme sites in the
DEN-2 virus-specific cDNA are not conserved in the DEN-1,
DEN-3, and DEN-4 viruses. Therefore, splicing sites must
be engineered into the cDNA to enable the splicing of
heterotypic DEN-1, DEN-3, and DEN-4 prM and E genes into
the DEN-2 backbone. It is not yet clear precisely how the
nonstructural proteins of flaviviruses interact with the
structural proteins during intracellular maturation of the
virus. Furthermore, the interaction of the capsid protein
with the genomic mRNA molecule in the nucleocapsid of the
virion has not been defined. However, coexpression of the
E and prM proteins has been more successful than
expression of E alone in expression systems in vitro. The
DEN-2 nonstructural proteins are involved in all
virus-specific intracellular polyprotein processing and
replication of viral mRNA, and the predominant portion of
the mRNA genome interacting with the capsid protein is
presumably, but not necessarily, DEN-2 virus-specific.
For these reasons, our strategy is to splice in the prM
and E genes of DEN-1, DEN-3, and DEN-4 viruses very
precisely, while maintaining the DEN-2 context of the
bracketing capsid and NS1 protein regions.

The strategies for creating XhoI and XbaI splice
sites at the 5' end of the prM gene and near the 3' end of
the E gene are illustrated in detail in Figures 33 and 34,
respectively. Briefly, mutagenic primers containing the
appropriate RENZ site are utilized in PCR reactions to
synthesize new cDNA for the prM and E genes of all four
viruses. A DEN-2 PDK-53 virus-specific cDNA cassette
plasmid, designated pD2V-CAS12, containing the genome
region from the 5' terminus through nucleotide position
4696 is constructed via intermediate plasmid constructs
pF1-Xho and pF2-Xba as illustrated in Figures 35 and 36.
The XhoI/XbaI cDNA fragments cut directly from DEN-1,
DEN-3, and DEN-4 virus-specific amplicons synthesized by
PCR using the mutagenic primers are ligated into the
pD2V-CAS12 cassette plasmid to create subclone chimeras.
The SstI/KpnI fragments of the resulting pD1V-CAS12,
pD3V-CAS12, and pD4V-CAS12 cassettes are moved into
pD2/IC-130V restricted with SstI/KpnI to create the
chimeric full genome-length cDNA clones (Figure 36).

Infectious cDNA clones permit the directed
engineering of viral genomes. Depending on their
viability in terms of ability to replicate in cell
culture, infectious clone-derived viruses can be modified
by incorporating point mutations, multiple mutations,
deletions, gene regions of related or heterologous
viruses, or nonviral genes. Infectious cDNA clones have
been developed for many RNA viruses, including
flaviviruses DEN-4 (Lai et al., 1991), yellow fever (Rice
et al., 1989), Kunjin (Khromykh and Westaway, 1994),
Japanese encephalitis (Sumiyoshi et al., 1992), and TBE
(unpublished data). We describe herein the development of
infectious cDNA clones for DEN-2 16681 virus and its
candidate vaccine derivative, strain PDK-53. We also
describe the construction of chimeric viruses,
incorporating the prM and E genes of candidate DEN-1,
DEN-3, and DEN-4 vaccine viruses within the genetic
background of the DEN-2 PDK-53 vaccine virus.

Although the candidate vaccine viruses developed at
Mahidol University are currently the best live DEN virus
vaccine candidates in terms of immunogenicity and safety
in adult humans, the DEN-1, DEN-3, and DEN-4 vaccine
viruses replicate poorly in cell culture and possess low
infectivity in humans, requiring up to 2000-fold more PFU
of virus to infect and immunize humans than is needed for
the DEN-2 PDK-53 vaccine virus. The low infectivities of
these viruses have significant implications for vaccine
production in cell culture, potentially decreased
immunogenic efficacy, and more rapid inactivation under
conditions of a poorly maintained cold chain in tropical
countries where dengue viruses are endemic.

The purpose of engineering chimeric DEN vaccine
viruses is to enhance the replicative ability and
immunogenicity of the DEN-1, DEN-3, and DEN-4 vaccine
viruses. A primary assumption has been that the
attenuated DEN-2 PDK-53 vaccine virus replicates to
appropriate levels in cell culture. In fact, it does
appear that the genome of DEN-2 PDK-53 virus is eminently
suited to serve as the genetic backbone for chimeric
viruses containing the prM and E genes of DEN-1, DEN-3,
and DEN-4 vaccine viruses. We have recently completed
growth curves for DEN-2 16681 virus, DEN-2 PDK-53 virus,
and their infectious clone derivative viruses in LLC-MK2
cells.

The viruses were titrated in Vero cell monolayers.
These data are shown in the following table:

                     Maximum Titer      Maximum
   Virus               (PFU/ml)          Titer

   DEN-2 16681        2.6 x 10^8          10
   D2/IC-30P-A        1.7 x 10^7           8
   D2/IC-30P-A2       6.6 x 10^7           7
   DEN-2 PDK-53       3.8 x 10^7           9
   D2/IC-130V-F       2.9 x 10^7           7
   D2/IC-130V-J       1.7 x 10^7           7

The DEN-2 PDK-53 virus and its infectious clone derivative
viruses grow to approximately 10^7 PFU/ml in LLC-MK2 cells,
about as well as the DEN-2 16681 virus.
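
For comparison on a common scale, the peak titers in the table can be converted to log10 PFU/ml. This is a minimal Python sketch; the titer values are transcribed from the table, and the dictionary and function names are illustrative.

```python
import math

# Maximum titers (PFU/ml) transcribed from the growth-curve table.
PEAK_TITERS = {
    "DEN-2 16681": 2.6e8,
    "D2/IC-30P-A": 1.7e7,
    "D2/IC-30P-A2": 6.6e7,
    "DEN-2 PDK-53": 3.8e7,
    "D2/IC-130V-F": 2.9e7,
    "D2/IC-130V-J": 1.7e7,
}


def log10_titers(titers):
    """Return each peak titer as log10 PFU/ml, rounded to 2 places."""
    return {name: round(math.log10(value), 2)
            for name, value in titers.items()}


if __name__ == "__main__":
    for name, log_titer in log10_titers(PEAK_TITERS).items():
        print(f"{name:14s} {log_titer:.2f} log10 PFU/ml")
```

On this scale the PDK-53 virus and both of its clone derivatives fall between 7 and 8 log10 PFU/ml.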

A second assumption is that the chimeric DEN viruses
will be viable and the DEN-2 PDK-53 virus-specific
replication machinery will significantly increase
replication of the chimeric viruses in cell culture and
increase their infectivity and immunogenicity in humans
relative to the wild-type vaccine viruses. The high
degree of conservation of amino acid sequences among the
polyproteins of the four DEN viruses should ensure that
the chimeric viruses will be viable. The level of
replication attained by the chimeric DEN viruses is
determined empirically, as was determined for the DEN-2
PDK-53 infectious clone derivative virus.

Bray et al. (1991) constructed chimeric DEN-4/1 and
DEN-4/2 viruses that appeared to appropriately express
DEN-1 and DEN-2 structural protein antigens in the genetic
background of DEN-4 virus. These investigators spliced
much of the 5'-noncoding region, and the capsid, prM and E
genes of DEN-1 or DEN-2 virus into the full-length cDNA
clone of DEN-4 virus. The near 3'-terminal splice site
they chose in the E gene is very close to that proposed by
us in our project. These chimeric viruses replicated very
slowly relative to the wild-type viruses. The authors
attributed this slow replication to possible suboptimal
gene expression, assembly, and/or maturation due to
incompatibility of heterotypic genes or RNA packaging in
the nucleocapsid. Another possibility is that cDNA errors
may have been incorporated into their constructs. In
contrast, Pletnev et al. (1993) engineered chimeric
viruses between DEN-4 virus and tick-borne encephalitis
(TBE) virus, which is a very distant flavivirus relative
of DEN viruses. Thus, DEN virus chimeras may be derived
that are viable.

A third assumption is that our chimeric DEN viruses
will express the appropriate structural protein antigens
of DEN-1, DEN-3, and DEN-4 viruses, and that vaccinees
will respond with development of appropriate serum titers
of DEN-1, DEN-3, and DEN-4 neutralizing antibodies
following immunization with the chimeric viruses. We
describe the insertion of the prM and E genes of DEN-1,
DEN-3, and DEN-4 viruses into the DEN-2 clone. Thr-to-Ser
amino acid substitutions near the amino terminus of the
prM protein in DEN-2, DEN-2/1, DEN-2/3, and DEN-2/4
viruses resulting from mutagenesis to create the XhoI site
of the cassettes should be conservative in nature and
affect the phenotype of derived viruses minimally, if at
all. Alternatively, a unique MluI site (ACGCGT) could be
created via a single, silent A-to-G point mutation at
nucleotide position 453 in the DEN-2 clone. The MluI site
immediately preceding the T7 promoter could easily be
eliminated by cutting the clone with MluI, blunt-ending,
and religation. The clone-derived DEN-2 and chimeric
viruses would then have the prM amino-terminal sequence
"FHLTTR."

The carboxyl-terminal 24 amino acids of the E
glycoprotein of all of the infectious clone-derived
viruses will be those of the DEN-2 PDK-53 virus.
Therefore, the E protein of all of the chimeric viruses
will have amino acid mutations in this region. Yet, the
carboxyl-terminal 39 amino acids of the DEN virus E
protein comprise membrane-spanning, transmembrane domains.
In all enveloped viruses, the transmembrane domains of the
integral viral proteins of related viruses are quite
variable in amino acid sequence. It has often been noted
that the important conserved feature of amino acids in
this domain lies in their hydrophobic, "lipid-loving"
nature rather than in the absolute sequence. Creation of
a MroI site (TCCGGA) or a unique AgeI site (ACCGGT) at
nucleotide positions 2281-2286 in the DEN-2 clone would
result in amino acids "SG" or "TG", respectively, at
positions E-449 and E-450 in the clone-derived viruses.
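
The coding consequences of the proposed restriction sites can be checked with the standard genetic code. The Python sketch below assumes each six-base site is read in frame as two codons: for MroI and AgeI this follows from the stated positions E-449 and E-450, and for MluI it is an assumption inferred from the "TR" at the end of the quoted prM sequence "FHLTTR." All function names are illustrative.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate in-frame DNA into one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def silent_a_to_g_precursors(site: str) -> list:
    """Sequences that yield `site` by one A-to-G change without
    altering the encoded amino acids (in-frame reading assumed)."""
    target = translate(site)
    hits = []
    for i, base in enumerate(site):
        if base == "G":
            precursor = site[:i] + "A" + site[i + 1:]
            if translate(precursor) == target:
                hits.append(precursor)
    return hits


if __name__ == "__main__":
    print("MroI TCCGGA ->", translate("TCCGGA"))
    print("AgeI ACCGGT ->", translate("ACCGGT"))
    print("MluI ACGCGT ->", translate("ACGCGT"))
    print("silent precursors:", silent_a_to_g_precursors("ACGCGT"))
```

Under the in-frame assumption, only a third-position change in the Thr codon (ACA to ACG) creates the MluI site silently; the other possible A-to-G change would convert His to Arg.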

The E proteins of all flaviviruses share a similar
gross tertiary structure that is indicated by the absolute
conservation of the 6 Cys residues in the prM protein and
of the 12 Cys residues in the ectodomain (the region
located on the environment side of the viral lipid
envelope) of the E protein of DEN, Japanese encephalitis,
West Nile, Murray Valley encephalitis, St. Louis
encephalitis, Kunjin, yellow fever, TBE, Langat, and
Powassan flaviviruses (data not shown). Cys residues are
involved in intrachain Cys-Cys disulfide bonds that
determine the overall structure of the protein. We fully
expect the DEN-2/1, DEN-2/3, and DEN-2/4 chimeric viruses
to be viable and to replicate more efficiently than the
wild-type DEN-1, DEN-3, and DEN-4 vaccine viruses,
respectively. Furthermore, chimeric recombinants
involving the genetic backbone of one flavivirus and the
structural genes of a variety of different flaviviruses
may also be viable, as has been demonstrated for DEN-4/TBE
virus recombinants (Pletnev et al., 1993). Such
recombinant viruses offer the potential opportunity to
engineer chimeric vaccine viruses for a number of
flavivirus-associated diseases within the genetic
background of a single flavivirus. The X-ray
crystallographic structure of the E glycoprotein of TBE
flavivirus has recently been published (Rey et al., 1995).
This development has significant implications for the
future design of flavivirus molecular vaccines.
A fourth assumption is that the chimeric DEN viruses
will retain the attenuated phenotype of the wild-type
DEN-1, DEN-3, and DEN-4 vaccine viruses, despite enhanced
replicative efficacy provided by the more efficient
nonstructural genes and 5' and 3' noncoding regions of the
DEN-2 PDK-53 virus. This presupposes that DEN-2 PDK-53
virus has attenuating mutations in the noncoding regions
or in the nonstructural genes and/or that attenuating
mutations occur in the prM/E region of the genomes of
DEN-1, DEN-3, and DEN-4 viruses. Mutations in essentially
any region of the viral genome may be capable of
attenuating a virulent virus. This has been demonstrated
for a number of viruses including poliovirus, VEE virus,
and Theiler's virus. Noncoding as well as protein coding
regions may be involved in attenuation. Attenuating
mutations in the envelope proteins of enveloped viruses
are common (Barrett et al., 1990).

The nucleotide mutations in DEN-2 PDK-53 virus at
genome nucleotide positions 57 (5'-noncoding region), 524
(prM), 2579 (NS1), 4018 (NS2A), and 6599 (NS4A) may be
involved in attenuation of the virus. Unless the prM
amino acid mutation is the only mutation affecting
virulence of the virus, the DEN-2 PDK-53 genetic
background, within which the structural genes from
heterologous viruses will be expressed, does itself
possess genotypic markers of attenuation. We can
determine the genetic loci involved in the attenuation of
the DEN-2 PDK-53 virus by analyzing DEN-2 16681/PDK-53
recombinant viruses derived from chimeric 16681/PDK-53
full-length clones. The E gene of DEN-2 PDK-53 virus
contains no attenuating mutations.

Although investigators have sequenced the structural
genes of numerous DEN-3 virus strains (e.g., Lanciotti et
al., 1994), none have sequenced the DEN-3 16562 virus,
parent to the DEN-3 PCMK-30/FRhL-3 vaccine virus. After
determining the sequences of the prM and E genes of this
virus, we can establish if any amino acid mutations have
occurred within these genes in the DEN-3 vaccine virus.
By comparison, nucleotide sequence information for the
parental DEN-1 and DEN-4 viruses has been determined
(unpublished data (parental DEN-1 virus); Lanciotti et
al., submitted for publication (parental DEN-4 virus)).
The nucleotide sequences of the E gene of DEN-4 1036 virus
and both prM and E genes of DEN-1 16007 virus have been
determined. The following amino acid mutations were
identified:

                E Protein                Amino Acid
   Virus        Amino Acid         Parent       Vaccine
   type         Position           Strain       Strain

   DEN-1        E-130               Val          Ala
                E-203               Glu          Lys
                E-204               Arg          Lys
                E-225               Ser          Leu
                E-384               Ala          Glu
                E-477               Met          Val

   DEN-4        E-345               Glu          Lys
                E-364               Val          Ala



There were six amino acid mutations in the E protein of
DEN-1 16007 PDK-13 virus and two mutations in that of DEN-4
1036 PDK-48 virus. There were no amino acid substitutions
in the prM protein of the DEN-1 vaccine virus. Glu-to-Lys
and Lys-to-Glu amino acid substitutions, as occur at DEN-1
E-203 and DEN-4 E-345, are common motifs in sequence
comparisons between parent viruses and their vaccine
derivatives. It is likely that the heterologous prM/E
cDNA inserts in recombinant full-length cDNA clones will
transport genetic loci of attenuation into the chimeric
DEN-2/1, DEN-2/3, and DEN-2/4 virus derivatives. The
optimum scenario for the chimeric viruses involves
increased replication ability in the presence of genetic
loci of attenuation in the heterologous DEN-1, DEN-3, and
DEN-4 structural gene inserts within the genetic
background of the DEN-2 PDK-53 virus.
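
As an illustration of how the tabulated substitutions might be screened for charge-changing motifs such as Glu-to-Lys, the following Python sketch transcribes the E-protein table and flags substitutions that alter nominal side-chain charge. The charge assignments (Asp/Glu negative, Lys/Arg positive, all others neutral) are a textbook simplification assumed for this sketch, and the function names are hypothetical.

```python
# E-protein substitutions transcribed from the table above:
# (virus, E-protein position, parent residue, vaccine residue)
E_MUTATIONS = [
    ("DEN-1", 130, "Val", "Ala"),
    ("DEN-1", 203, "Glu", "Lys"),
    ("DEN-1", 204, "Arg", "Lys"),
    ("DEN-1", 225, "Ser", "Leu"),
    ("DEN-1", 384, "Ala", "Glu"),
    ("DEN-1", 477, "Met", "Val"),
    ("DEN-4", 345, "Glu", "Lys"),
    ("DEN-4", 364, "Val", "Ala"),
]

# Nominal side-chain charges near neutral pH (an assumption of this
# sketch; His and all other residues are treated as uncharged).
SIDE_CHAIN_CHARGE = {"Asp": -1, "Glu": -1, "Lys": 1, "Arg": 1}


def charge_shift(parent, vaccine):
    """Net change in nominal side-chain charge for parent -> vaccine."""
    return SIDE_CHAIN_CHARGE.get(vaccine, 0) - SIDE_CHAIN_CHARGE.get(parent, 0)


def charge_altering(mutations):
    """Return only the substitutions whose nominal charge changes."""
    return [
        (virus, f"E-{pos}", f"{parent}->{vaccine}", charge_shift(parent, vaccine))
        for virus, pos, parent, vaccine in mutations
        if charge_shift(parent, vaccine) != 0
    ]


if __name__ == "__main__":
    for row in charge_altering(E_MUTATIONS):
        print(*row)
```

Under these assumptions the sketch flags DEN-1 E-203 and DEN-4 E-345 (both Glu to Lys) and DEN-1 E-384 (Ala to Glu), while DEN-1 E-204 (Arg to Lys) conserves charge and is not flagged.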

Nucleotide sequence analysis of expressed genes is
essential. The error rate in the original RT/PCR-derived
cDNA clones of DEN-2 16681 virus was 8.2 × 10⁻⁴, that is, 1
cDNA error for every 1227 nucleotides of cloned, sequenced
cDNA. In a previous sequencing project involving VEE
virus and employing classical, non-PCR cDNA synthesis
methodology, the error rate was calculated to be 3.9 × 10⁻⁴,
or 1 error for every 2543 nucleotides of cloned, sequenced
cDNA. These errors are due to nucleotide incorporation
errors by reverse transcriptase during first-strand cDNA
synthesis and perhaps to the cloning of individual
variants within the original population of virions.
Unlike many DNA polymerases, RNA polymerases and reverse
transcriptase have no editing function. Incorrect
nucleotides incorporated during strand elongation are not
detected or removed before elongation continues. The Taq DNA
polymerase is also known to incorporate errors into PCR
amplicons. Thus, at least 4-8 cDNA "errors" can be
expected to occur in 10 kb of cloned cDNA. We have
observed the incorporation of spurious in-frame
termination codons (TAA, TAG, TGA) in cDNA clones derived
from both VEE and DEN viruses. Premature termination of
amino acid translation would result in a truncated protein
and would undoubtedly be a lethal mutation for a candidate
infectious clone. Much of the utility of genes expressed
in vitro is compromised when those genes are not
characterized by sequence analysis. If cDNA errors occur
in candidate infectious cDNA clones, it may be difficult
to determine whether phenotypic effects of directed mutations
are due to the engineered mutation, to cDNA errors, or to
synergistic action or compensation between errors and
engineered mutations.
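
The per-nucleotide error rates quoted above are the reciprocals of the nucleotides-per-error figures, and the estimate of 4-8 errors in 10 kb follows from scaling those rates to a clone of roughly genome length. A minimal Python sketch of that arithmetic (the function names are illustrative and not part of the original work):

```python
def error_rate(nucleotides_per_error: int) -> float:
    """Per-nucleotide error rate implied by one error per N sequenced nucleotides."""
    return 1.0 / nucleotides_per_error


def expected_errors(rate: float, length_nt: int) -> float:
    """Expected number of cDNA errors in a clone of the given length."""
    return rate * length_nt


# Figures quoted in the text above.
RT_PCR_RATE = error_rate(1227)     # RT/PCR-derived DEN-2 16681 clones
CLASSICAL_RATE = error_rate(2543)  # classical, non-PCR VEE clones

if __name__ == "__main__":
    for label, rate in (("RT/PCR", RT_PCR_RATE), ("classical", CLASSICAL_RATE)):
        print(f"{label}: {rate:.1e} per nt, "
              f"{expected_errors(rate, 10_000):.1f} expected errors in 10 kb")
```

The two rates bracket the stated range: the classical rate gives about four expected errors per 10 kb and the RT/PCR rate about eight.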

Wiktor et al. (1984) reported that two cDNA errors
caused spurious amino acid substitutions in rabies virus
glycoprotein expressed in recombinant vaccinia virus and
resulted in expression of non-authentic rabies
glycoprotein. After sequence analysis and correction of
the cDNA, expression of authentic rabies glycoprotein was
obtained. A faulty cDNA clone may behave as expected in
one circumstantial context, yet behave very
inappropriately and be highly misleading in a different
context. A faulty structural gene cDNA clone of the
virulent VEE Trinidad donkey (TRD) virus that was
expressed in recombinant vaccinia virus was essentially
authentic by monoclonal antibody analysis of expressed VEE
virus-specific proteins and by protection of immunized
mice from challenge with virulent VEE virus (Kinney et
al., 1988a; Kinney et al., 1988b). However, incorporation
of this cDNA clone into an infectious cDNA clone of VEE
virus completely abrogated the virulence of the
clone-derived virus, whereas the corrected cDNA fragment
resulted in derivation of virulent virus (Kinney et al.,
1993).
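
One class of cDNA error mentioned above, the spurious in-frame termination codon, is straightforward to screen for once a clone has been sequenced. The following Python sketch scans an open reading frame for TAA, TAG, or TGA before its final codon. It is only an illustration: the function name, the 1-based inclusive coordinates, and the toy sequences (the first six codons of the DEN-2 16681 polyprotein followed by an added stop codon) are assumptions of this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(cdna: str, orf_start: int, orf_end: int) -> list[int]:
    """Return 1-based positions of in-frame stop codons that occur
    before the final codon of the ORF [orf_start, orf_end] (inclusive)."""
    orf = cdna[orf_start - 1:orf_end].upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    hits = []
    # Stop before the last codon, which is expected to terminate translation.
    for offset in range(0, len(orf) - 3, 3):
        if orf[offset:offset + 3] in STOP_CODONS:
            hits.append(orf_start + offset)
    return hits


if __name__ == "__main__":
    # Toy clones: two leading non-coding nucleotides, six codons, one stop.
    authentic = "GG" + "ATGAATAACCAACGGAAATAA"
    faulty = "GG" + "ATGAATAACTAACGGAAATAA"  # CAA -> TAA (C-to-T error)
    print(premature_stops(authentic, 3, 23))
    print(premature_stops(faulty, 3, 23))
```

A scan of this kind detects only termination codons; errors that cause amino acid substitutions, such as those described for the rabies and VEE clones, still require comparison against an authenticated sequence.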

Although Lai et al. (1991) originally derived their
infectious clone of DEN-4 virus from sequence-characterized
subunit cDNA clones (Zhao et al., 1986;
Mackow et al., 1987), the original full-length clone was
not infectious (Lai et al., 1991). While these
investigators indicated that they sequenced both strands
of much of the cloned genomic cDNA, they did not indicate
that they sequenced more than a single clone for a given
cDNA region. Nucleotides encoding cDNA errors will be
confirmed on both cDNA strands, but will not be identified
as errors unless two or more independent
cDNA clones covering the same region of the genome are
sequenced. The functional full-length clone of DEN-4
virus was obtained by repeated splicing of large new cDNA
fragments into the full-length clone until a functional
clone was obtained. The authors did not indicate that the
newly cloned regions were characterized by nucleotide
sequence analysis (Lai et al., 1991). It is probable that
the slowed replication of the DEN-4/1 and DEN-4/2 chimeric
viruses relative to wild-type viruses reported by Bray et
al. (1991) is due to the presence of cDNA artifacts within
the full-length cDNA clone. The critical importance of
accurate nucleotide sequence characterization of genes
expressed in vitro, particularly when those genes are
expressed in the form of infectious cDNA clones, is still
not widely appreciated in the molecular biology
field.
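
The point that cDNA errors become visible only when independent clones of the same region are compared can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes the clone sequences have already been aligned to equal length; the example uses the first 24 nucleotides of the DEN-2 genome with one invented substitution.

```python
from collections import Counter


def discordant_positions(clones: list[str]) -> list[tuple[int, dict[str, int]]]:
    """Compare aligned sequences from independent cDNA clones of one region.

    Returns (1-based position, base counts) for every column in which
    the clones disagree. Sequencing both strands of a single clone
    cannot produce such a list, because an error fixed in that clone
    is read consistently on both strands.
    """
    if len(clones) < 2:
        raise ValueError("at least two independent clones are required")
    if len({len(c) for c in clones}) != 1:
        raise ValueError("clone sequences must be aligned to equal length")
    flagged = []
    for position, column in enumerate(zip(*clones), start=1):
        counts = Counter(base.upper() for base in column)
        if len(counts) > 1:
            flagged.append((position, dict(counts)))
    return flagged


if __name__ == "__main__":
    clone_a = "AGTTGTTAGTCTACGTGGACCGAC"
    clone_b = "AGTTGTTAGTCTACGTGAACCGAC"  # invented G-to-A error at position 18
    print(discordant_positions([clone_a, clone_b]))
```

With only two clones a discordant position shows that one of them carries an error but not which one; a third independent clone, or direct sequencing of amplified viral cDNA, would be needed to resolve it.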

Although putative nucleotide sequences for the
genomes of DEN-2 16681 and DEN-2 PDK-53 viruses have been
reported in the literature (Blok et al., 1992), our
sequence results indicate that the published data are
highly flawed. Blok et al. (1992) reported 53 nucleotide
mutations between the two viruses; we found only 8
mutations. We analyzed at least two independent cDNA
clones for regions covering the entire genomes of both
viruses. The DEN-2 16681 sequencing project was completed
prior to receiving the DEN-2 PDK-53 virus in our
laboratory, and the nucleotide sequence of the PDK-53
virus was determined from cDNA amplified directly from
virus present in vaccine vials.

There are now only two classes of infectious clones
developed for vaccine flaviviruses that have themselves
been administered to humans: the infectious clone of
yellow fever virus, vaccine strain 17D (Rice et al., 1989;
Hahn et al., 1987; Rice et al., 1985), and the DEN-1,
DEN-2, DEN-3, and DEN-4 vaccine derivative infectious clones
described herein. Both classes of infectious clones have
the important advantage of being derived from vaccine
viruses that have been tested for efficacy and safety in
humans. The yellow fever 17D virus vaccine has long been
one of the most effective human vaccines developed;
immunization with this virus provides lifelong immunity.
In the case of DEN virus, it is essential that vaccines
provide immunity against infection by all four serotypes
of the virus. DEN-1, DEN-2, DEN-3, and DEN-4 vaccine
viruses have been developed at Mahidol University,
Bangkok, Thailand. All four vaccine viruses have been
tested in humans and have been demonstrated to be
immunogenic and safe for human adults.

Replicating vaccines in the form of live, attenuated
viruses offer distinct advantages in terms of immunogenic
efficacy due to replicative amplification of viral
antigens (antigenic mass) in the vaccinees and replication
in appropriate target tissues. Inactivated or subunit
antigens usually suffer from a lack of sufficient
antigenic mass and subsequent failure to stimulate an
effective immune response. Expression of proteins in
recombinant vaccinia virus, which replicates primarily at
the site of inoculation, may provide protection against
parenteral challenge with virulent virus, but may not
protect against an aerosol challenge. This was
demonstrated for VEE virus when it was shown that
recombinant vaccinia virus expressing the structural
proteins of VEE virus protected mice from intraperitoneal
challenge, but not intranasal challenge, with virulent VEE
virus (Kinney et al., 1988b). Immunization with the live,
attenuated VEE TC-83 vaccine virus, on the other hand,
provided immunity against both parenteral challenge
(immunity provided by circulating serum IgG antibody) and
intranasal challenge (mucosal, IgA-based immunity) with
virulent VEE virus. Furthermore, the level of immunity,
as measured by titers of VEE virus-specific neutralizing
antibody, was considerably higher in TC-83
virus-immunized mice and horses (the natural epidemic host for
VEE virus) than in animals immunized with recombinant
vaccinia/VEE virus (Kinney et al., 1988b; Bowen et al.,
1992). Similar results have been reported for
vaccinia/influenza A virus recombinants in rodents (Smith
et al., 1986). Furthermore, a replicating vaccine virus
provides the appropriate T-cell epitopes to stimulate
cell-mediated immunity as well as humoral immunity.
T-cell epitopes may be lacking in subunit vaccines. In
short, vaccination with a safe live, attenuated vaccine
virus provides the optimal immunization of a natural
infection in terms of the type and level of immunity
elicited and the repertoire of viral antigens involved in
generating the immune response.

To use the DEN viruses described herein as vaccine
candidates, it is necessary to rederive the viruses by
transfection of a cell line, such as primary dog kidney,
certified for human use under conditions of good
laboratory practice and management to ensure the avoidance
of potential adventitious agents that might be present in
uncertified cell lines. Although the cDNA-derived viruses
originate from candidate vaccine viruses that have
undergone testing in humans, they require recertification
by analysis for possible in vitro phenotypic markers of
attenuation and by safety testing in small animals and
probably nonhuman primates. All investigative studies
involving the pathogenesis of DEN virus are hampered by
the unavailability of a suitable animal model. Certain in
vitro characteristics are apparently associated with
attenuation of DEN viruses, but the only definitive test
is a vaccine trial in human volunteers. Vaccine trials
would presumably follow those of the original wild-type
vaccine viruses developed at Mahidol University. The
protocol includes titration of the individual vaccine
virus candidates in adult human volunteers to determine
the minimal infectious/immunogenic dose for each virus.
This is followed by immunization trials with different
bivalent and trivalent combinations of vaccine virus. The
final test is the quadravalent vaccine composed of
appropriate doses of all four vaccine viruses. If the
preliminary trials are successful, larger trials are
scheduled, and the vaccine viruses are tested in children,
who are the primary target for vaccine delivery.
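
To give a sense of the size of the combination stage of such a protocol, the following Python sketch enumerates the possible bivalent, trivalent, and quadravalent formulations of the four vaccine viruses. It is a counting aid only; the text does not state which combinations would actually be tested, and the names used are illustrative.

```python
from itertools import combinations

SEROTYPES = ("DEN-1", "DEN-2", "DEN-3", "DEN-4")


def formulations(viruses, valency):
    """All distinct vaccine formulations containing `valency` viruses."""
    return list(combinations(viruses, valency))


if __name__ == "__main__":
    for valency, label in ((2, "bivalent"), (3, "trivalent"), (4, "quadravalent")):
        combos = formulations(SEROTYPES, valency)
        print(f"{label}: {len(combos)} formulation(s)")
        for combo in combos:
            print("   ", " + ".join(combo))
```

With four viruses there are six possible bivalent and four possible trivalent formulations ahead of the single quadravalent vaccine.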

We describe herein a preferred method to develop an
infectious cDNA clone for a flavivirus. Optimally, a
wild-type vaccine virus serves as the template for the
clone construction. Large cDNA fragments are amplified
from the genomic mRNA by PCR using virus-specific primers
and directly cloned into a TA-vector or into the MCS of a
low-copy-number plasmid following restriction of the
amplicon cDNA. The low-copy pBRUC-139 vector contains the
MCS of pUC19 to permit convenient cloning of cDNA using a
variety of RENZ sites. Other low-copy plasmids are
available. The bacteriophage T7 or SP6 promoter is
usually engineered into the 5'-terminal mRNA-sense
amplimer, and a unique RENZ site for linearization of the
recombinant plasmid containing the full-length cDNA must
be engineered into the 3'-terminal complementary
(negative)-sense amplimer. Exhaustive nucleotide analysis
of the cDNA clones is desirable.

SEQ ID NO  PRIMER         MER/SENSE  SEQUENCE

3          pUC/M13-P5     25/+       5'-CCCAGTCACGACGTTGTAAAACGAC-3'
4          pUC/M13-P5B    27/+       5'-GGATGTGCTGCAAGGCGATTAAGTTGG-3'
5          pUC/M13-P3     25/+       5'-TGAGCGGATAACAATTTCACACAGG-3'
6          pUC/M13-P3B    27/-       5'-GGCTTTACACTTTATGCTTCCGGCTCG-3'
7          D2-1-ECO.T7    75/+       5'-GCGGATATTG/GAATTC/TCTAGA/
                                     AATTTAATACGACTCACTATA/
                                     AGTTGTTAGTCTACGTGGACCGACAAAGACAG-3'
                                     (5'-Fill/EcoRI/XbaI/T7 Promoter/
                                     5'-end of DEN-2)
8          D2-SMT71       77/+       5'-CCAGT/GAATTC/GAGCTC/ACGCGT/
                                     AAATTTAATACGACTCACTATA/
                                     AGTTGTTAGTCTACGTGGACCGACAAAGACAG-3'
                                     (5'-Fill/EcoRI/SstI/MluI/T7 Promoter/
                                     5'-end of DEN-2)
9          D2-1           24/+       5'-AGTTGTTAGTCTACGTGGACCGAC-3'
10         D2-28          34/+       5'-GACAGATTCTTTGAGGGAGCTGAGCTCAACGTAG-3'
11         D2-134         28/+       5'-TCAATATGCTGAAACGCGAGAGAAACCG-3'
12         CD2-250        26/-       5'-GGGATTGTTAGGAAACGAAGGAACGC-3'
13         D2-274         32/+       5'-CCACCAACAGCAGGGATACTGAAAAGATGGGG-3'
14         CD2-378        25/-       5'-TGCAGATCTGCGTCTCCTATTCAAG-3'
15         D2-528         25/+       5'-CGTGAACATGTGTACCCTCATCGCC-3'
16         CD2-616        26/-       5'-TTGCACCAACAGTCAATGTCTTCAGG-3'
17         D2-616         25/+       5'-ACCAGAAGACATAGATTGTTGGTGC-3'
18         CD2-618        25/-       5'-GCACCAACAGTCTATGTCTTCTGGC-3'
19         CD2-771        25/-       5'-ATGTTTCCAGGCCCCTTCTGATGAC-3'
20         D2-847         25/+       5'-GCAGCAATCCTGGCATACACCATAG-3'
21         D2-996         27/+       5'-GGTTGACATAGTCTTAGAACATGGAAG-3'
22         CD2-996        27/-       5'-CTTCCATGTTCTAAGACTATGTCAACC-3'
23         D2-1005        35/+       5'-GTCTTAGAACATGGAAGTTGTGTGACGACGATGGC-3'
24         D2-1141        25/+       5'-ACAACAGAATCTCGCTGCCCAACAC-3'
25         D2-1211        25/+       5'-GCAAACACTCCATGGTAGACAGAGG-3'
26         CD2-1211       25/-       5'-CCTCTGTCTACCATGGAGTGTTTGC-3'
27         CD2-1227       27/-       5'-CCACATCCATTTCCCCATCCTCTGTCT-3'
28         D2-1261        30/+       5'-GGAAAGGGAGGCATTGTGACCTCTGCTATG-3'
29         D2-1416        28/+       5'-GGAAATCAAAATAACACCACAGAGTTCC-3'
30         CD2-1503       34/-       5'-CTGCAGCAACACCATCTCATTGAAGTCGAGGCCC-3'
31         D2-1510        25/+       5'-GACTTCAATGAGATGGTGCTGCTGC-3'
32         CD2-1510       25/-       5'-GCAGCAGCACCATCTCATTGAAGTC-3'
33         D2-1546        28/+       5'-AAGCTTGGCTGGTGCACAGGCAATGGTT-3'
34         CD2-1567       27/-       5'-TGGTAACGGCAGGTCTAGGAACCATTG-3'
35         D2-1777        23/+       5'-GGACATCTCAAGTGCAGGCTGAG-3'
36         CD2-1777       23/-       5'-CTCAGCCTGCACTTGAGATGTCC-3'
37         D2-1863        27/+       5'-GAAGGAAATAGCAGAAACACAACATGG-3'
38         CD2-1888       33/-       5'-CCCTTCATATTGTACTCTGATAACTATTGTTCC-3'
39         D2-2047        32/+       5'-CCTCCATTCGGAGACAGCTACATCATCATAGG-3'
40         CD2-2047       32/-       5'-CCTATGATGATGTAGCTGTCTCCGAATGGAGG-3'
41         D2-2170        29/+       5'-ATGGCCATTTTAGGTGACACAGCCTGGGA-3'
42         CD2-2200       27/-       5'-TGTAAACACTCCTCCCAGGGATCCAAA-3'
43         D2-2308        29/+       5'-CTCATAGGAGTCATTATCACATGGATAGG-3'
44         CD2-2504       35/-       5'-GGGGATTCltSGTTGGAACTTATATTGTTCTGTCC-3'
45         CD2-2622       30/-       5'-TGATTCAATTCTGGTGTTATTTGTTTCCAC-3'
46         D2-2702        25/+       5'-AAGGAATCATGCAGGCAGGAAAACG-3'
47         CD2-2864       22/-       5'-ACTTCCASCGASTTCCAAGCTC-3'
                                        A A
48         D2-2992        25/+       5'-AACAGAGCCGTCCATGCCGATATGG-3'
49         CD2-3105       22/-       5'-TCCATTGCTCCA&AGGGTGTGT-3'
                                        G
50         D2-3236        25/+       5'-AGCTTGAGATGGACTTTGATTTCTG-3'
51         CD2-3410       22/-       5'-GGTCTGATTTCCATCCCGTACC-3'
52         D2-3621        23/+       5'-GTCCTTTAGAGACCTGGGAAGAG-3'
53         CD2-3739       25/-       5'-GXTTTCTCAAGAGTAGTCCAGCTGC-3'
                                        C
54         D2-3905        25/+       5'-ATCAATTGGCAGTGACTATCATGGC-3'
55         CD2-4002       25/-       5'-TGTTAAGASCAGTGG^GAAACGGAC-3'
                                        A G
56         CD2-4060       25/-       5'-GATTGAGACCTTTGATCGTCAACGC-3'
57         D2-4214        25/+       5'-TGACAGGACCATTAGTGGCTGGAGG-3'
58         D2-4257        34/+       5'-CGTGCTCACTGGACGATCGGCCX^TTTGGAACTG-3'
59         CD2-4323       24/-       5'-GGGCTGCTTCCTGATATITCTGCC-3'
                                        C
60         D2-4497        25/+       5'-CCTGTGGGAAGTGAAGAAACAACGG-3'
61         CD2-4557       30/-       5'-GCHXXIATCTTCX7VGTTCAGCCITTCCCATG-3'
62         CD2-4615       25/-       5'-CTCCGG CTCC^ATCTG^GAGTATCC-3'
                                        G G A
63         D2-4746        25/+       5'-CCTAATATCATATGGAGGAGGCTGG-3'
64         D2-4792        25/+       5'-GAAGGAGAAGAAGTCCAGGTATTGG-3'
65         CD2-4922       25/-       5'-£TGTCGA£AATTGGAGATCCTGACG-3'
                                        T T
66         D2-4994        25/+       5'-GTGGAGCATATGTGAGTGCTATAGC-3'
67         D2-5124        25/+       5'-TCTGACTATGGCCGGAAGGTATCTC-3'
68         D2-5173        25/+       5'-ACATTAATCTTGGCCCCCACTAGAG-3'
69         CD2-5272       19/-       5'-CGATCTCCCGCCCGGTGTG-3'
70         CD2-5318       25/-       5'-CI7VACTGGTGATAGCAGCCTCATGG-3'
71         CD2-5656       27/-       5'-CCTACTGAGTTGTATCACTTTCTTTCC-3'
72         CD2-5B91       26/-       5'-TGGATTTCTTCXrrATTCTCCCTCTTC-3'
73         D2-5770        25/+       5'-TTCAAGGCTGAGAGGGTTATAGACC-3'
74         D2-6152        25/+       5'-TCTGGTTGGCCTACAGAGTGGCAGC-3'
75         CD2-6252       27/-       5'-CCTTCTTTTGTCCAGATTTC^CTTCC-3'
                                        A
76         D2-6493        35/+       5'-GCGTACAACCATGCTCTCAGTGAACTGCCGGAGAC-3'
77         CD2-6605       24/-       5'-TTCCCAGGGTCATCTTCCCTATAC-3'
                                        G
78         CD2-6624       31/-       5'-GATGCTAGCCGTGATTATGCAGCACATTCCC-3'
79         D2-6748        25/+       5'-AAACAGAGAACACCCCAAGACAACC-3'
80         CD2-6932       21/-       5'-CGGCATACAGCGTCCATGCTG-3'
81         D2-7055        25/+       5'-GTCTCGGGAAAGGATGGCCATTGTC-3'
82         CD2-7195       25/-       5'-CTCTGGITGCTTTTGCTTG^AGTQC-3'
                                        A G G
83         CD2-7217       27/-       5'-CCGCCGCTCCTCTTTTCTGAGCrTCTC-3'
84         D2-7378        25/+       5'-AGGACTACATGGGCTCTGTGTGAGG-3'
85         CD2-7515       19/-       5'-GAGAAGTCCAGCTCOGGCC-3'
86         D2-7769        25/+       5'-AGAGAAACATGGTCACACCAGAAGG-3'
87         CD2-7885       22/-       5'-GTTCTTCGTGTCCTGGTCCTCC-3'
88         D2-8165        25/+       5'-GGAAATATGGAGGAGCCTAGTGAGG-3'
89         CD2-8210       22/-       5'-ACCCAGTACATCTCATGTGTGG-3'
90         D2-8428        28/+       5'-GAGCATGAAACATCATGGCACTATGACC-3'
91         D2-8440        25/+       5'-TCATGGCACTATGACCAAGACCACC-3'
92         CD2-8529       22/-       5'-OVC^CCTGACCACTCOGTTCACC-3'
                                        C A G
93         D2-8773        25/+       5'-AAGGTGAGAAGCAATGCAGCCTTGG-3'
94         D2-8798        29/+       5'-GGGCCATATTCACTGATGAGAACAAGTGG-3'
95         CD2-8865       22/-       5'-2CTTTCC£TGTCAACCAGCTCC-3'
                                        C T
96         D2-9046        25/+       5'-AATGAAGATCACTGGTTCTCCAGAG-3'
97         D2-9131        25/+       5'-ACGTGAGCAAGAAAGAGGGAGGAGC-3'
98         CD2-9166       22/-       5'-TGTCCCATCCTGCTGTGTCATC-3'
                                        A G
99         CD2-9234       30/-       5'-GCTAGTTTCTTGTGTTCTCCTTCCATGTGG-3'
100        D2-9344        25/+       5'-TCATATCGAGAAGAGACCAAAGAGG-3'
101        CD2-9429       24/-       5'-ACTCCTTCTCCCTCCATCTGTCTG-3'
102        CD2-9438                  5'-ATGCTTTT£AAGAITCCTTCTCCCTCC-3'
                                        A C
103        CD2-9468       32/-       5'-GCACAGCGATTTCTTCTGTGATTGTTAGGTGC-3'
104        D2-9645        25/+       5'-ACAATGGGAACCTTCAAGAGGATGG-3'
105        D2-9656.BAM    45/+       5'-TTATCACATT/GGATCC/TTCAAGAGGATGGA
                                     ATGATTGGACACAAG-3'
                                     (5'-Fill/BamHI/DEN-2 Sequence)
106        CD2-9668       28/-       5'-C7VGAAGGGCACTTGTGTCCAATCATTCC-3'
107        CD2-9779       21/-       5'-CTCCgTGGGA^ATTCGGGCTC-3'
                                        T G
108        CD2-9796       28/-       5'-CCGTCTCCCGCAAAGACCACCCTGCTCC-3'
109        CD2-9796.XBA   44/-       5'-TTATCACCTA/TCTAGA/
                                     CCGTCTCCCGCAAAGACCACCCTGCTCC-3'
110        CD2-9913       26/-       5'-GTTGGAACCCAATGTGATGGTACTGC-3'
111        D2-9937        25/+       5'-ACAAGTCGAACAACCTGGTCCATAC-3'
112        CD2-9977       21/-       5'-GCATGTCTTCCGTfiGTCATCC-3'
                                        T
113        CD2-10003      25/-       5'-CTTGAATCCACACCCTGTTCCAGAC-3'
114        D2-10203       25/+       5'-ATACACAGATTACATGCCATCCATG-3'
115        CD2-10261      21/-       5'-TTTTGC£TTCTACCACAG£AC-3'
                                        T A
116        H9- 1 09RQ     25/-       5'-GAAACAAGGCTAGAAGTCAGGTGGG-3'
117        CD2-10337      23/-       5'-GACGGGGCTCACAGGTAGCATAG-3'
118        D2-10418       25/+       5'-GCCTGTAGCTCCACCTGAGAAGGTG-3'
119        D2-10470       25/+       5'-GGAAGCTGTACGCATGGOGTAGTGG-3'
120        CD2-10530      19/-       5'-GGGCCCCCgTTGTTGCTGC-3'
                                        A
121        CD2-10687      37/-       5'-AGAACCTGTTGATTCAACAGCACCATTCCATTTTCTG-3'
122        CD2-10687.XBA  59/-       5'-TTATCACCTA/GCATGC/TCTAGA/
                                     AGAACCTGTTGATTCAACAGCACCATTCCATTTTCTG-3'
                                     (5'-Fill/SphI/XbaI/
                                     3'-End DEN-2 Sequence)
123        CD2-10687.X2   52/-       5'-TTATCACCTA/TCTAGA/
                                     GAACCTGTTGATTCAACAGCACCATTCCATTTTCTG-3'
                                     (5'-Fill/XbaI/
                                     3'-End DEN-2 Sequence)
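
The composite primers in the table above can be checked mechanically. The Python sketch below assembles primer D2-1-ECO.T7 from its slash-delimited segments and confirms the stated 75-mer length, and it verifies that CD2-996 is the reverse complement of D2-996. The first (fill) segment is read here as GCGGATATTG, an assumption chosen because it yields the stated length; the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Segments of primer D2-1-ECO.T7 (SEQ ID NO: 7) as listed in the table.
# The fill segment is an assumed reading of the printed sequence.
D2_1_ECO_T7_PARTS = [
    ("5' fill", "GCGGATATTG"),
    ("EcoRI site", "GAATTC"),
    ("XbaI site", "TCTAGA"),
    ("T7 promoter", "AATTTAATACGACTCACTATA"),
    ("5' end of DEN-2", "AGTTGTTAGTCTACGTGGACCGACAAAGACAG"),
]


def assemble(parts):
    """Join labeled primer segments in 5'-to-3' order."""
    return "".join(seq for _label, seq in parts)


if __name__ == "__main__":
    primer = assemble(D2_1_ECO_T7_PARTS)
    print(len(primer), primer)
    d2_996 = "GGTTGACATAGTCTTAGAACATGGAAG"
    cd2_996 = "CTTCCATGTTCTAAGACTATGTCAACC"
    print(reverse_complement(d2_996) == cd2_996)
```

The same length check can be applied to the other composite primers; for example, the XbaI-tagged 3'-terminal primers combine a fill segment, one or two restriction sites, and the complement of the 3' end of the genome.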

While particular embodiments of the invention have
been described in detail, it will be apparent to those 
skilled in the art that these embodiments are exemplary 
rather than limiting, and the true scope of the 
invention is that defined within the attached claims. 

SEQUENCE LISTING



(1) GENERAL INFORMATION 

(i) APPLICANT: MAHIDOL UNIVERSITY

Bangkok, Thailand 

The United States of 

America, as represented by the Secretary, 
Department of Health and Human Services 
c/o Centers for Disease Control and 
Prevention 

Technology Transfer Office 
Mail Stop B-67 
1600 Clifton Road 
Atlanta, Georgia 30333 

(ii) TITLE OF THE INVENTION: INFECTIOUS CDNA CLONES FOR DENGUE 2 

VIRUS ... 

(iii) NUMBER OF SEQUENCES: 137 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: NEEDLE & ROSENBERG, P.C. 

(B) STREET: Suite 1200, 127 Peachtree Street, NE 

(C) CITY: Atlanta 

(D) STATE: GA 

(E) COUNTRY: USA 

(F) ZIP: 30303 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ Version 1.5 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: U.S. Serial No. 08/483,292 

(B) FILING DATE: 7 Jun 1995 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 



(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Spratt, Gwendolyn D. 

(B) REGISTRATION NUMBER: 36,016 

(C) REFERENCE/DOCKET NUMBER: 14114.0179/P 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 404-688-0770 

(B) TELEFAX: 404-688-9880 

(C) TELEX: 

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10723 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 97...10269
(D) OTHER INFORMATION: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

AGTTGTTAGT CTACGTGGAC CGACAAAGAC AGATTCTTTG AGGGAGCTAA GCTCAACGTA  60
GTTCTAACAG TTTTTTAATT AGAGAGCAGA TCTCTG ATG AAT AAC CAA CGG AAA  114
                                        Met Asn Asn Gln Arg Lys
                                        1               5

AAG GCG AAA AAC ACG CCT TTC AAT ATG CTG AAA CGC GAG AGA AAC CGC  162
Lys Ala Lys Asn Thr Pro Phe Asn Met Leu Lys Arg Glu Arg Asn Arg
            10                  15                  20

GTG TCG ACT GTG CAA CAG CTG ACA AAG AGA TTC TCA CTT GGA ATG CTG  210
Val Ser Thr Val Gln Gln Leu Thr Lys Arg Phe Ser Leu Gly Met Leu
        25                  30                  35

CAG GGA CGA GGA CCA TTA AAA CTG TTC ATG GCC CTG GTG GCG TTC CTT  258
Gln Gly Arg Gly Pro Leu Lys Leu Phe Met Ala Leu Val Ala Phe Leu
    40                  45                  50

CGT TTC CTA ACA ATC CCA CCA ACA GCA GGG ATA TTG AAG AGA TGG GGA  306
Arg Phe Leu Thr Ile Pro Pro Thr Ala Gly Ile Leu Lys Arg Trp Gly
55                  60                  65                  70

ACA ATT AAA AAA TCA AAA GCT ATT AAT GTT TTG AGA GGG TTC AGG AAA  354
Thr Ile Lys Lys Ser Lys Ala Ile Asn Val Leu Arg Gly Phe Arg Lys
                75                  80                  85

GAG ATT GGA AGG ATG CTG AAC ATC TTG AAT AGG AGA CGC AGA TCT GCA  402
Glu Ile Gly Arg Met Leu Asn Ile Leu Asn Arg Arg Arg Arg Ser Ala
            90                  95                  100

GGC ATG ATC ATT ATG CTG ATT CCA ACA GTG ATG GCG TTC CAT TTA ACC  450
Gly Met Ile Ile Met Leu Ile Pro Thr Val Met Ala Phe His Leu Thr
        105                 110                 115

ACA CGT AAC GGA GAA CCA CAC ATG ATC GTC AGC AGA CAA GAG AAA GGG  498
Thr Arg Asn Gly Glu Pro His Met Ile Val Ser Arg Gln Glu Lys Gly
    120                 125                 130

AAA AGT CTT CTG TTT AAA ACA GAG GAT GGC GTG AAC ATG TGT ACC CTC  546
Lys Ser Leu Leu Phe Lys Thr Glu Asp Gly Val Asn Met Cys Thr Leu
135                 140                 145                 150

ATG GCC ATG GAC CTT GGT GAA TTG TGT GAA GAC ACA ATC ACG TAC AAG  594
Met Ala Met Asp Leu Gly Glu Leu Cys Glu Asp Thr Ile Thr Tyr Lys
                155                 160                 165

TGT CCC CTT CTC AGG CAG AAT GAG CCA GAA GAC ATA GAC TGT TGG TGC  642
Cys Pro Leu Leu Arg Gln Asn Glu Pro Glu Asp Ile Asp Cys Trp Cys
            170                 175                 180

AAC TCT ACG TCC ACG TGG GTA ACT TAT GGG ACG TGT ACC ACC ATG GGA  690
Asn Ser Thr Ser Thr Trp Val Thr Tyr Gly Thr Cys Thr Thr Met Gly
        185                 190                 195

GAA CAT AGA AGA GAA AAA AGA TCA GTG GCA CTC GTT CCA CAT GTG GGA  738
Glu His Arg Arg Glu Lys Arg Ser Val Ala Leu Val Pro His Val Gly
    200                 205                 210

ATG GGA CTG GAG ACA CGA ACT GAA ACA TGG ATG TCA TCA GAA GGG GCC  786
Met Gly Leu Glu Thr Arg Thr Glu Thr Trp Met Ser Ser Glu Gly Ala
215                 220                 225                 230

TGG AAA CAT GTC CAG AGA ATT GAA ACT TGG ATC TTG AGA CAT CCA GGC  834
Trp Lys His Val Gln Arg Ile Glu Thr Trp Ile Leu Arg His Pro Gly
                235                 240                 245

TTC ACC ATG ATG GCA GCA ATC CTG GCA TAC ACC ATA GGA ACG ACA CAT  882
Phe Thr Met Met Ala Ala Ile Leu Ala Tyr Thr Ile Gly Thr Thr His
            250                 255                 260

TTC CAA AGA GCC CTG ATT TTC ATC TTA CTG ACA GCT GTC ACT CCT TCA  930
Phe Gln Arg Ala Leu Ile Phe Ile Leu Leu Thr Ala Val Thr Pro Ser
        265                 270                 275

ATG ACA ATG CGT TGC ATA GGA ATG TCA AAT AGA GAC TTT GTG GAA GGG  978
Met Thr Met Arg Cys Ile Gly Met Ser Asn Arg Asp Phe Val Glu Gly
    280                 285                 290

GTT TCA GGA GGA AGC TGG GTT GAC ATA GTC TTA GAA CAT GGA AGC TGT  1026
Val Ser Gly Gly Ser Trp Val Asp Ile Val Leu Glu His Gly Ser Cys
295                 300                 305                 310

GTG ACG ACG ATG GCA AAA AAC AAA CCA ACA TTG GAT TTT GAA CTG ATA  1074
Val Thr Thr Met Ala Lys Asn Lys Pro Thr Leu Asp Phe Glu Leu Ile
                315                 320                 325

AAA ACA GAA GCC AAA CAG CCT GCC ACC CTA AGG AAG TAC TGT ATA GAG  1122
Lys Thr Glu Ala Lys Gln Pro Ala Thr Leu Arg Lys Tyr Cys Ile Glu
            330                 335                 340

GCA AAG CTA ACC AAC ACA ACA ACA GAA TCT CGC TGC CCA ACA CAA GGG  1170
Ala Lys Leu Thr Asn Thr Thr Thr Glu Ser Arg Cys Pro Thr Gln Gly
        345                 350                 355

GAA CCC AGC CTA AAT GAA GAG CAG GAC AAA AGG TTC GTC TGC AAA CAC  1218
Glu Pro Ser Leu Asn Glu Glu Gln Asp Lys Arg Phe Val Cys Lys His
    360                 365                 370

TCC ATG GTA GAC AGA GGA TGG GGA AAT GGA TGT GGA CTA TTT GGA AAG  1266
Ser Met Val Asp Arg Gly Trp Gly Asn Gly Cys Gly Leu Phe Gly Lys
375                 380                 385                 390

GGA GGC ATT GTG ACC TGT GCT ATG TTC AGA TGC AAA AAG AAC ATG GAA  1314
Gly Gly Ile Val Thr Cys Ala Met Phe Arg Cys Lys Lys Asn Met Glu
                395                 400                 405

GGA AAA GTT GTG CAA CCA GAA AAC TTG GAA TAC ACC ATT GTG ATA ACA  1362
Gly Lys Val Val Gln Pro Glu Asn Leu Glu Tyr Thr Ile Val Ile Thr
            410                 415                 420

CCT CAC TCA GGG GAA GAG CAT GCA GTC GGA AAT GAC ACA GGA AAA CAT  1410
Pro His Ser Gly Glu Glu His Ala Val Gly Asn Asp Thr Gly Lys His
        425                 430                 435

GGC AAG GAA ATC AAA ATA ACA CCA CAG AGT TCC ATC ACA GAA GCA GAA  1458
Gly Lys Glu Ile Lys Ile Thr Pro Gln Ser Ser Ile Thr Glu Ala Glu
    440                 445                 450

TTG ACA GGT TAT GGC ACT GTC ACA ATG GAG TGC TCT CCA AGA ACG GGC  1506
Leu Thr Gly Tyr Gly Thr Val Thr Met Glu Cys Ser Pro Arg Thr Gly
455                 460                 465                 470

CTC GAC TTC AAT GAG ATG GTG TTG CTG CAG ATG GAA AAT AAA GCT TGG  1554
Leu Asp Phe Asn Glu Met Val Leu Leu Gln Met Glu Asn Lys Ala Trp
                475                 480                 485

CTG GTG CAC AGG CAA TGG TTC CTA GAC CTG CCG TTA CCA TGG TTG CCC  1602
Leu Val His Arg Gln Trp Phe Leu Asp Leu Pro Leu Pro Trp Leu Pro
            490                 495                 500

GGA GCG GAC ACA CAA GGG TCA AAT TGG ATA CAG AAA GAG ACA TTG GTC  1650
Gly Ala Asp Thr Gln Gly Ser Asn Trp Ile Gln Lys Glu Thr Leu Val
        505                 510                 515

ACT TTC AAA AAT CCC CAT GCG AAG AAA CAG GAT GTT GTT GTT TTA GGA  1698
Thr Phe Lys Asn Pro His Ala Lys Lys Gln Asp Val Val Val Leu Gly
    520                 525                 530

TCC CAA GAA GGG GCC ATG CAC ACA GCA CTT ACA GGG GCC ACA GAA ATC  1746
Ser Gln Glu Gly Ala Met His Thr Ala Leu Thr Gly Ala Thr Glu Ile
535                 540                 545                 550

CAA ATG TCA TCA GGA AAC TTA CTC TTC ACA GGA CAT CTC AAG TGC AGG  1794
Gln Met Ser Ser Gly Asn Leu Leu Phe Thr Gly His Leu Lys Cys Arg
                555                 560                 565

CTG AGA ATG GAC AAG CTA CAG CTC AAA GGA ATG TCA TAC TCT ATG TGC  1842
Leu Arg Met Asp Lys Leu Gln Leu Lys Gly Met Ser Tyr Ser Met Cys
            570                 575                 580

ACA GGA AAG TTT AAA GTT GTG AAG GAA ATA GCA GAA ACA CAA CAT GGA  1890
Thr Gly Lys Phe Lys Val Val Lys Glu Ile Ala Glu Thr Gln His Gly
        585                 590                 595

ACA ATA GTT ATC AGA GTG CAA TAT GAA GGG GAC GGC TCT CCA TGC AAG  1938
Thr Ile Val Ile Arg Val Gln Tyr Glu Gly Asp Gly Ser Pro Cys Lys
    600                 605                 610

ATC CCT TTT GAG ATA ATG GAT TTG GAA AAA AGA CAT GTC TTA GGT CGC  1986
Ile Pro Phe Glu Ile Met Asp Leu Glu Lys Arg His Val Leu Gly Arg
615                 620                 625                 630

CTG ATT ACA GTC AAC CCA ATT GTG ACA GAA AAA GAT AGC CCA GTC AAC  2034
Leu Ile Thr Val Asn Pro Ile Val Thr Glu Lys Asp Ser Pro Val Asn
                635                 640                 645

ATA GAA GCA GAA CCT CCA TTC GGA GAC AGC TAC ATC ATC ATA GGA GTA  2082
Ile Glu Ala Glu Pro Pro Phe Gly Asp Ser Tyr Ile Ile Ile Gly Val
            650                 655                 660

GAG CCG GGA CAA CTG AAG CTC AAC TGG TTT AAG AAA GGA AGT TCT ATC  2130
Glu Pro Gly Gln Leu Lys Leu Asn Trp Phe Lys Lys Gly Ser Ser Ile
        665                 670                 675

GGC CAA ATG TTT GAG ACA ACA ATG AGG GGG GCG AAG AGA ATG GCC ATT  2178
Gly Gln Met Phe Glu Thr Thr Met Arg Gly Ala Lys Arg Met Ala Ile
    680                 685                 690

TTA GGT GAC ACA GCC TGG GAT TTT GGA TCC TTG GGA GGA GTG TTT ACA  2226
Leu Gly Asp Thr Ala Trp Asp Phe Gly Ser Leu Gly Gly Val Phe Thr
695                 700                 705                 710

TCT ATA GGA AAG GCT CTC CAC CAA GTC TTT GGA GCA ATC TAT GGA GCT  2274
Ser Ile Gly Lys Ala Leu His Gln Val Phe Gly Ala Ile Tyr Gly Ala
                715                 720                 725

GCC TTC AGT GGG GTT TCA TGG ACT ATG AAA ATC CTC ATA GGA GTC ATT  2322
Ala Phe Ser Gly Val Ser Trp Thr Met Lys Ile Leu Ile Gly Val Ile
            730                 735                 740

ATC ACA TGG ATA GGA ATG AAT TCA CGC AGC ACC TCA CTG TCT GTG ACA  2370
Ile Thr Trp Ile Gly Met Asn Ser Arg Ser Thr Ser Leu Ser Val Thr
        745                 750                 755

CTA GTA TTG GTG GGA ATT GTG ACA CTG TAT TTG GGA GTC ATG GTG CAG  2418
Leu Val Leu Val Gly Ile Val Thr Leu Tyr Leu Gly Val Met Val Gln
    760                 765                 770

GCC GAT AGT GGT TGC GTT GTG AGC TGG AAA AAC AAA GAA CTG AAA TGT  2466
Ala Asp Ser Gly Cys Val Val Ser Trp Lys Asn Lys Glu Leu Lys Cys
775                 780                 785                 790

GGC AGT GGG ATT TTC ATC ACA GAC AAC GTG CAC ACA TGG ACA GAA CAA  2514
Gly Ser Gly Ile Phe Ile Thr Asp Asn Val His Thr Trp Thr Glu Gln
                795                 800                 805

TAC AAG TTC CAA CCA GAA TCC CCT TCA AAA CTA GCT TCA GCT ATC CAG  2562
Tyr Lys Phe Gln Pro Glu Ser Pro Ser Lys Leu Ala Ser Ala Ile Gln
            810                 815                 820

AAA GCC CAT GAA GAG GGC ATT TGT GGA ATC CGC TCA GTA ACA AGA CTG  2610
Lys Ala His Glu Glu Gly Ile Cys Gly Ile Arg Ser Val Thr Arg Leu
        825                 830                 835
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GAG AAT CTG ATG TGG AAA CAA ATA ACA CCA GAA TTG AAT CAC ATT CTA 2658 
Glu Asn Leu Met Trp Lys Gin lie Thr Pro Glu Leu Asn His lie Leu 
840 845 850 

TCA GAA AAT GAG GTG AAG TTA ACT ATT ATG ACA GGA GAC ATC AAA GGA 2706 

Ser Glu Asn Glu Val Lys Leu Thr lie Met Thr Gly Asp lie Lys Gly 
855 860 865 870 

ATC ATG CAG GCA GGA AAA CGA TCT CTG CGG CCT CAG CCC ACT GAG CTG 2754 
He Met Gin Ala Gly Lys Arg Ser Leu Arg Pro Gin Pro Thr Glu Leu 

875 880 885 

AAG TAT TCA TGG AAA ACA TGG GGC AAA GCA AAA ATG CTC TCT ACA GAG 2802 
Lys Tyr Ser Trp Lys Thr Trp Gly Lys Ala Lys Met Leu Ser Thr Glu 

890 895 900 

TCT CAT AAC CAG ACC TTT CTC ATT GAT GGC CCC GAA ACA GCA GAA TGC 2850 
Ser His Asn Gin Thr Phe Leu He Asp Gly Pro Glu Thr Ala Glu Cys 
905 910 915 

CCC AAC ACA AAT AGA GCT TGG AAT TCG TTG GAA GTT GAA GAC TAT GGC 2898 
Pro Asn Thr Asn Arg Ala Trp Asn Ser Leu Glu Val Glu Asp Tyr Gly 
920 ** 925 930 

TTT GGA GTA TTC ACC ACC AAT ATA TGG CTA AAA TTG AAA GAA AAA CAG 2946 
Phe Gly Val Phe Thr Thr Asn He Trp Leu Lys Leu Lys Glu Lys Gin 
935 940 945 950 

GAT GTA TTC TGC GAC TCA AAA CTC ATG TCA GCG GCC ATA AAA GAC AAC 2994 

Asp Val Phe Cys Asp Ser Lys Leu Met Ser Ala Ala He Lys Asp Asn 

955 960 965 

AGA GCC GTC CAT GCC GAT ATG GGT TAT TGG ATA GAA AGT GCA CTC AAT 3042 
Arg Ala Val His Ala Asp Met Gly Tyr Trp He Glu Ser Ala Leu Asn 

970 975 980 

GAC ACA TGG AAG ATA GAG AAA GCC TCT TTC ATT GAA GTT AAA AAC TGC 3090 
Asp Thr Trp Lys He Glu Lys Ala Ser Phe He Glu Val Lys Asn Cys 
985 990 995 

CAC TGG CCA AAA TCA CAC ACC CTC TGG AGC AAT GGA GTG CTA GAA AGT 3138 
His Trp Pro Lys Ser His Thr Leu Trp Ser Asn Gly Val Leu Glu Ser 
1000 1005 1010 

GAG ATG ATA ATT CCA AAG AAT CTC GCT GGA CCA GTG TCT CAA CAC AAC 3186 
Glu Met Ile Ile Pro Lys Asn Leu Ala Gly Pro Val Ser Gln His Asn
1015 1020 1025 1030 

TAT AGA CCA GGC TAC CAT ACA CAA ATA ACA GGA CCA TGG CAT CTA GGT 3234 
Tyr Arg Pro Gly Tyr His Thr Gln Ile Thr Gly Pro Trp His Leu Gly

1035 1040 1045 

AAG CTT GAG ATG GAC TTT GAT TTC TGT GAT GGA ACA ACA GTG GTA GTG 3282 

Lys Leu Glu Met Asp Phe Asp Phe Cys Asp Gly Thr Thr Val Val Val 

1050 1055 1060 

ACT GAG GAC TGC GGA AAT AGA GGA CCC TCT TTG AGA ACA ACC ACT GCC 3330 
Thr Glu Asp Cys Gly Asn Arg Gly Pro Ser Leu Arg Thr Thr Thr Ala 
1065 1070 1075

TCT GGA AAA CTC ATA ACA GAA TGG TGC TGC CGA TCT TGC ACA TTA CCA 3378
Ser Gly Lys Leu Ile Thr Glu Trp Cys Cys Arg Ser Cys Thr Leu Pro
1080 1085 1090 

CCG CTA AGA TAC AGA GGT GAG GAT GGG TGC TGG TAC GGG ATG GAA ATC 3426 
Pro Leu Arg Tyr Arg Gly Glu Asp Gly Cys Trp Tyr Gly Met Glu Ile
1095 1100 1105 1110 

AGA CCA TTG AAG GAG AAA GAA GAG AAT TTG GTC AAC TCC TTG GTC ACA 3474 
Arg Pro Leu Lys Glu Lys Glu Glu Asn Leu Val Asn Ser Leu Val Thr 

1115 1120 1125 

GCT GGA CAT GGG CAG GTC GAC AAC TTT TCA CTA GGA GTC TTG GGA ATG 3522 
Ala Gly His Gly Gln Val Asp Asn Phe Ser Leu Gly Val Leu Gly Met

1130 1135 1140 

GCA TTG TTC CTG GAG GAA ATG CTT AGG ACC CGA GTA GGA ACG AAA CAT 3570 
Ala Leu Phe Leu Glu Glu Met Leu Arg Thr Arg Val Gly Thr Lys His 
1145 1150 1155 

GCA ATA CTA CTA GTT GCA GTT TCT TTT GTG ACA TTG ATC ACA GGG AAC 3618 
Ala Ile Leu Leu Val Ala Val Ser Phe Val Thr Leu Ile Thr Gly Asn
1160 1165 1170 

ATG TCC TTT AGA GAC CTG GGA AGA GTG ATG GTT ATG GTA GGC GCC ACT 3666 
Met Ser Phe Arg Asp Leu Gly Arg Val Met Val Met Val Gly Ala Thr 
1175 1180 1185 1190 

ATG ACG GAT GAC ATA GGT ATG GGC GTG ACT TAT CTT GCC CTA CTA GCA 3714 
Met Thr Asp Asp Ile Gly Met Gly Val Thr Tyr Leu Ala Leu Leu Ala

1195 1200 1205 

GCC TTC AAA GTC AGA CCA ACT TTT GCA GCT GGA CTA CTC TTG AGA AAG 3762 
Ala Phe Lys Val Arg Pro Thr Phe Ala Ala Gly Leu Leu Leu Arg Lys 

1210 1215 1220 

CTG ACC TCC AAG GAA TTG ATG ATG ACT ACT ATA GGA ATT GTA CTC CTC 3810 
Leu Thr Ser Lys Glu Leu Met Met Thr Thr Ile Gly Ile Val Leu Leu
1225 1230 1235 

TCC CAG AGC ACC ATA CCA GAG ACC ATT CTT GAG TTG ACT GAT GCG TTA 3858 

Ser Gln Ser Thr Ile Pro Glu Thr Ile Leu Glu Leu Thr Asp Ala Leu
1240 1245 1250 

GCC TTA GGC ATG ATG GTC CTC AAA ATG GTG AGA AAT ATG GAA AAG TAT 3906 
Ala Leu Gly Met Met Val Leu Lys Met Val Arg Asn Met Glu Lys Tyr 
1255 1260 1265 1270 

CAA TTG GCA GTG ACT ATC ATG GCT ATC TTG TGC GTC CCA AAC GCA GTG 3954 
Gln Leu Ala Val Thr Ile Met Ala Ile Leu Cys Val Pro Asn Ala Val

1275 1280 1285 

ATA TTA CAA AAC GCA TGG AAA GTG AGT TGC ACA ATA TTG GCA GTG GTG 4002 
Ile Leu Gln Asn Ala Trp Lys Val Ser Cys Thr Ile Leu Ala Val Val

1290 1295 1300 

TCC GTT TCC CCA CTG CTC TTA ACA TCC TCA CAG CAA AAA ACA GAT TGG 4050 

Ser Val Ser Pro Leu Leu Leu Thr Ser Ser Gln Gln Lys Thr Asp Trp
1305 1310 1315

ATA CCA TTA GCA TTG ACG ATC AAA GGT CTC AAT CCA ACA GCT ATT TTT 4098
Ile Pro Leu Ala Leu Thr Ile Lys Gly Leu Asn Pro Thr Ala Ile Phe
1320 1325 1330 

CTA ACA ACC CTC TCA AGA ACC AGC AAG AAA AGG AGC TGG CCA TTA AAT 4146 

Leu Thr Thr Leu Ser Arg Thr Ser Lys Lys Arg Ser Trp Pro Leu Asn 
1335 1340 1345 1350 

GAG GCT ATC ATG GCA GTC GGG ATG GTG AGC ATT TTA GCC AGT TCT CTC 4194 
Glu Ala Ile Met Ala Val Gly Met Val Ser Ile Leu Ala Ser Ser Leu

1355 1360 1365 

CTA AAA AAT GAT ATT CCC ATG ACA GGA CCA TTA GTG GCT GGA GGG CTC 4242 
Leu Lys Asn Asp Ile Pro Met Thr Gly Pro Leu Val Ala Gly Gly Leu

1370 1375 1380 

CTC ACT GTG TGC TAC GTG CTC ACT GGA CGA TCG GCC GAT TTG GAA CTG 4290 
Leu Thr Val Cys Tyr Val Leu Thr Gly Arg Ser Ala Asp Leu Glu Leu 
1385 1390 1395 

GAG AGA GCA GCC GAT GTC AAA TGG GAA GAC CAG GCA GAG ATA TCA GGA 4338 
Glu Arg Ala Ala Asp Val Lys Trp Glu Asp Gln Ala Glu Ile Ser Gly
1400 1405 1410 

AGC AGT CCA ATC CTG TCA ATA ACA ATA TCA GAA GAT GGT AGC ATG TCG 4386 
Ser Ser Pro Ile Leu Ser Ile Thr Ile Ser Glu Asp Gly Ser Met Ser
1415 1420 1425 1430 

ATA AAA AAT GAA GAG GAA GAA CAA ACA CTG ACC ATA CTC ATT AGA ACA 4434 
Ile Lys Asn Glu Glu Glu Glu Gln Thr Leu Thr Ile Leu Ile Arg Thr

1435 1440 1445 

GGA TTG CTG GTG ATC TCA GGA CTT TTT CCT GTA TCA ATA CCA ATC ACG 4482 
Gly Leu Leu Val Ile Ser Gly Leu Phe Pro Val Ser Ile Pro Ile Thr

1450 1455 1460

GCA GCA GCA TGG TAC CTG TGG GAA GTG AAG AAA CAA CGG GCC GGA GTA 4530 
Ala Ala Ala Trp Tyr Leu Trp Glu Val Lys Lys Gln Arg Ala Gly Val
1465 1470 1475 

TTG TGG GAT GTT CCT TCA CCC CCA CCC ATG GGA AAG GCT GAA CTG GAA 4578 

Leu Trp Asp Val Pro Ser Pro Pro Pro Met Gly Lys Ala Glu Leu Glu 
1480 1485 1490 

GAT GGA GCC TAT AGA ATT AAG CAA AAA GGG ATT CTT GGA TAT TCC CAG 4626 
Asp Gly Ala Tyr Arg Ile Lys Gln Lys Gly Ile Leu Gly Tyr Ser Gln
1495 1500 1505 1510 

ATC GGA GCC GGA GTT TAC AAA GAA GGA ACA TTC CAT ACA ATG TGG CAT 4674 
Ile Gly Ala Gly Val Tyr Lys Glu Gly Thr Phe His Thr Met Trp His

1515 1520 1525 

GTC ACA CGT GGC GCT GTT CTA ATG CAT AAA GGA AAG AGG ATT GAA CCA 4722 
Val Thr Arg Gly Ala Val Leu Met His Lys Gly Lys Arg Ile Glu Pro

1530 1535 1540 

TCA TGG GCG GAC GTC AAG AAA GAC CTA ATA TCA TAT GGA GGA GGC TGG 4770 
Ser Trp Ala Asp Val Lys Lys Asp Leu Ile Ser Tyr Gly Gly Gly Trp
1545 1550 1555

AAG TTA GAA GGA GAA TGG AAG GAA GGA GAA GAA GTC CAG GTA TTG GCA 4818
Lys Leu Glu Gly Glu Trp Lys Glu Gly Glu Glu Val Gln Val Leu Ala
1560 1565 1570 

CTG GAG CCT GGA AAA AAT CCA AGA GCC GTC CAA ACG AAA CCT GGT CTT 4866 
Leu Glu Pro Gly Lys Asn Pro Arg Ala Val Gln Thr Lys Pro Gly Leu
1575 1580 1585 1590 

TTC AAA ACC AAC GCC GGA ACA ATA GGT GCT GTA TCT CTG GAC TTT TCT 4914
Phe Lys Thr Asn Ala Gly Thr Ile Gly Ala Val Ser Leu Asp Phe Ser

1595 1600 1605 

CCT GGA ACG TCA GGA TCT CCA ATT ATC GAC AAA AAA GGA AAA GTT GTG 4962 
Pro Gly Thr Ser Gly Ser Pro Ile Ile Asp Lys Lys Gly Lys Val Val

1610 1615 1620 

GGT CTT TAT GGT AAT GGT GTT GTT ACA AGG AGT GGA GCA TAT GTG AGT 5010 
Gly Leu Tyr Gly Asn Gly Val Val Thr Arg Ser Gly Ala Tyr Val Ser 
1625 1630 1635 

GCT ATA GCC CAG ACT GAA AAA AGC ATT GAA GAC AAC CCA GAG ATC GAA 5058 
Ala Ile Ala Gln Thr Glu Lys Ser Ile Glu Asp Asn Pro Glu Ile Glu
1640 1645 1650 

GAT GAC ATT TTC CGA AAG AGA AGA CTG ACC ATC ATG GAC CTC CAC CCA 5106 

Asp Asp Ile Phe Arg Lys Arg Arg Leu Thr Ile Met Asp Leu His Pro
1655 1660 1665 1670 

GGA GCG GGA AAG ACG AAG AGA TAC CTT CCG GCC ATA GTC AGA GAA GCT 5154 

Gly Ala Gly Lys Thr Lys Arg Tyr Leu Pro Ala Ile Val Arg Glu Ala

1675 1680 1685 

ATA AAA CGG GGT TTG AGA ACA TTA ATC TTG GCC CCC ACT AGA GTT GTG 5202 
Ile Lys Arg Gly Leu Arg Thr Leu Ile Leu Ala Pro Thr Arg Val Val

1690 1695 1700 

GCA GCT GAA ATG GAG GAA GCC CTT AGA GGA CTT CCA ATA AGA TAC CAG 5250 
Ala Ala Glu Met Glu Glu Ala Leu Arg Gly Leu Pro Ile Arg Tyr Gln
1705 1710 1715 

ACC CCA GCC ATC AGA GCT GAG CAC ACC GGG CGG GAG ATT GTG GAC CTA 5298 
Thr Pro Ala Ile Arg Ala Glu His Thr Gly Arg Glu Ile Val Asp Leu
1720 1725 1730 

ATG TGT CAT GCC ACA TTT ACC ATG AGG CTG CTA TCA CCA GTT AGA GTG 5346 
Met Cys His Ala Thr Phe Thr Met Arg Leu Leu Ser Pro Val Arg Val 
1735 1740 1745 1750 

CCA AAC TAC AAC CTG ATT ATC ATG GAC GAA GCC CAT TTC ACA GAC CCA 5394 
Pro Asn Tyr Asn Leu Ile Ile Met Asp Glu Ala His Phe Thr Asp Pro

1755 1760 1765 

GCA AGT ATA GCA GCT AGA GGA TAC ATC TCA ACT CGA GTG GAG ATG GGT 5442 
Ala Ser Ile Ala Ala Arg Gly Tyr Ile Ser Thr Arg Val Glu Met Gly

1770 1775 1780 

GAG GCA GCT GGG ATT TTT ATG ACA GCC ACT CCC CCG GGA AGC AGA GAC 5490 
Glu Ala Ala Gly Ile Phe Met Thr Ala Thr Pro Pro Gly Ser Arg Asp
1785 1790 1795

CCA TTT CCT CAG AGC AAT GCA CCA ATC ATA GAT GAA GAA AGA GAA ATC 5538
Pro Phe Pro Gln Ser Asn Ala Pro Ile Ile Asp Glu Glu Arg Glu Ile
1800 1805 1810 

CCT GAA CGC TCG TGG AAT TCC GGA CAT GAA TGG GTC ACG GAT TTT AAA 5586 

Pro Glu Arg Ser Trp Asn Ser Gly His Glu Trp Val Thr Asp Phe Lys
1815 1820 1825 1830 

GGG AAG ACT GTT TGG TTC GTT CCA AGT ATA AAA GCA GGA AAT GAT ATA 5634 
Gly Lys Thr Val Trp Phe Val Pro Ser Ile Lys Ala Gly Asn Asp Ile

1835 1840 1845 

GCA GCT TGC CTG AGG AAA AAT GGA AAG AAA GTG ATA CAA CTC AGT AGG 5682 
Ala Ala Cys Leu Arg Lys Asn Gly Lys Lys Val Ile Gln Leu Ser Arg

1850 1855 1860 

AAG ACC TTT GAT TCT GAG TAT GTC AAG ACT AGA ACC AAT GAT TGG GAC 5730 
Lys Thr Phe Asp Ser Glu Tyr Val Lys Thr Arg Thr Asn Asp Trp Asp 
1865 1870 1875 

TTC GTG GTT ACA ACT GAC ATT TCA GAA ATG GGT GCC AAT TTC AAG GCT 5778 
Phe Val Val Thr Thr Asp Ile Ser Glu Met Gly Ala Asn Phe Lys Ala
1880 1885 1890 

GAG AGG GTT ATA GAC CCC AGA CGC TGC ATG AAA CCA GTC ATA CTA ACA 5826 
Glu Arg Val Ile Asp Pro Arg Arg Cys Met Lys Pro Val Ile Leu Thr
1895 1900 1905 1910 

GAT GGT GAA GAG CGG GTG ATT CTG GCA GGA CCT ATG CCA GTG ACC CAC 5874 

Asp Gly Glu Glu Arg Val Ile Leu Ala Gly Pro Met Pro Val Thr His

1915 1920 1925 

TCT AGT GCA GCA CAA AGA AGA GGG AGA ATA GGA AGA AAT CCA AAA AAT 5922 
Ser Ser Ala Ala Gln Arg Arg Gly Arg Ile Gly Arg Asn Pro Lys Asn

1930 1935 1940 

GAG AAT GAC CAG TAC ATA TAC ATG GGG GAA CCT CTG GAA AAT GAT GAA 5970 

Glu Asn Asp Gln Tyr Ile Tyr Met Gly Glu Pro Leu Glu Asn Asp Glu
1945 1950 1955 

GAC TGT GCA CAC TGG AAA GAA GCT AAA ATG CTC CTA GAT AAC ATC AAC 6018 

Asp Cys Ala His Trp Lys Glu Ala Lys Met Leu Leu Asp Asn Ile Asn
1960 1965 1970 

ACG CCA GAA GGA ATC ATT CCT AGC ATG TTC GAA CCA GAG CGT GAA AAG 6066 
Thr Pro Glu Gly Ile Ile Pro Ser Met Phe Glu Pro Glu Arg Glu Lys
1975 1980 1985 1990 

GTG GAT GCC ATT GAT GGC GAA TAC CGC TTG AGA GGA GAA GCA AGG AAA 6114 
Val Asp Ala Ile Asp Gly Glu Tyr Arg Leu Arg Gly Glu Ala Arg Lys

1995 2000 2005 

ACC TTT GTA GAC TTA ATG AGA AGA GGA GAC CTA CCA GTC TGG TTG GCC 6162 

Thr Phe Val Asp Leu Met Arg Arg Gly Asp Leu Pro Val Trp Leu Ala 

2010 2015 2020 

TAC AGA GTG GCA GCT GAA GGC ATC AAC TAC GCA GAC AGA AGG TGG TGT 6210 
Tyr Arg Val Ala Ala Glu Gly Ile Asn Tyr Ala Asp Arg Arg Trp Cys
2025 2030 2035

TTT GAT GGA GTC AAG AAC AAC CAA ATC CTA GAA GAA AAC GTG GAA GTT 6258
Phe Asp Gly Val Lys Asn Asn Gln Ile Leu Glu Glu Asn Val Glu Val
2040 2045 2050 

GAA ATC TGG ACA AAA GAA GGG GAA AGG AAG AAA TTG AAA CCC AGA TGG 6306 
Glu Ile Trp Thr Lys Glu Gly Glu Arg Lys Lys Leu Lys Pro Arg Trp
2055 2060 2065 2070 

TTG GAT GCT AGG ATC TAT TCT GAC CCA CTG GCG CTA AAA GAA TTT AAG 6354 
Leu Asp Ala Arg Ile Tyr Ser Asp Pro Leu Ala Leu Lys Glu Phe Lys

2075 2080 2085 

GAA TTT GCA GCC GGA AGA AAG TCT CTG ACC CTG AAC CTA ATC ACA GAA 6402 

Glu Phe Ala Ala Gly Arg Lys Ser Leu Thr Leu Asn Leu Ile Thr Glu

2090 2095 2100 

ATG GGT AGG CTC CCA ACC TTC ATG ACT CAG AAG GCA AGA GAC GCA CTG 6450 

Met Gly Arg Leu Pro Thr Phe Met Thr Gln Lys Ala Arg Asp Ala Leu
2105 2110 2115 

GAC AAC TTA GCA GTG CTG CAC ACG GCT GAG GCA GGT GGA AGG GCG TAC 6498 
Asp Asn Leu Ala Val Leu His Thr Ala Glu Ala Gly Gly Arg Ala Tyr 
2120 2125 2130 

AAC CAT GCT CTC AGT GAA CTG CCG GAG ACC CTG GAG ACA TTG CTT TTA 6546 
Asn His Ala Leu Ser Glu Leu Pro Glu Thr Leu Glu Thr Leu Leu Leu 
2135 2140 2145 2150 

CTG ACA CTT CTG GCT ACA GTC ACG GGA GGG ATC TTT TTA TTC TTG ATG 6594 
Leu Thr Leu Leu Ala Thr Val Thr Gly Gly Ile Phe Leu Phe Leu Met

2155 2160 2165 

AGC GGA AGG GGC ATA GGG AAG ATG ACC CTG GGA ATG TGC TGC ATA ATC 6642 
Ser Gly Arg Gly Ile Gly Lys Met Thr Leu Gly Met Cys Cys Ile Ile

2170 2175 2180 

ACG GCT AGC ATC CTC CTA TGG TAC GCA CAA ATA CAG CCA CAC TGG ATA 6690 
Thr Ala Ser Ile Leu Leu Trp Tyr Ala Gln Ile Gln Pro His Trp Ile
2185 2190 2195 

GCA GCT TCA ATA ATA CTG GAG TTT TTT CTC ATA GTT TTG CTT ATT CCA 6738 

Ala Ala Ser Ile Ile Leu Glu Phe Phe Leu Ile Val Leu Leu Ile Pro
2200 2205 2210 

GAA CCT GAA AAA CAG AGA ACA CCC CAA GAC AAC CAA CTG ACC TAC GTT 6786 
Glu Pro Glu Lys Gln Arg Thr Pro Gln Asp Asn Gln Leu Thr Tyr Val
2215 2220 2225 2230 

GTC ATA GCC ATC CTC ACA GTG GTG GCC GCA ACC ATG GCA AAC GAG ATG 6834 

Val Ile Ala Ile Leu Thr Val Val Ala Ala Thr Met Ala Asn Glu Met

2235 2240 2245 

GGT TTC CTA GAA AAA ACG AAG AAA GAT CTC GGA TTG GGA AGC ATT GCA 6882 
Gly Phe Leu Glu Lys Thr Lys Lys Asp Leu Gly Leu Gly Ser Ile Ala

2250 2255 2260 

ACC CAG CAA CCC GAG AGC AAC ATC CTG GAC ATA GAT CTA CGT CCT GCA 6930 
Thr Gln Gln Pro Glu Ser Asn Ile Leu Asp Ile Asp Leu Arg Pro Ala
2265 2270 2275

TCA GCA TGG ACG CTG TAT GCC GTG GCC ACA ACA TTT GTT ACA CCA ATG 6978
Ser Ala Trp Thr Leu Tyr Ala Val Ala Thr Thr Phe Val Thr Pro Met
2280 2285 2290 

TTG AGA CAT AGC ATT GAA AAT TCC TCA GTG AAT GTG TCC CTA ACA GCT 7026 
Leu Arg His Ser Ile Glu Asn Ser Ser Val Asn Val Ser Leu Thr Ala
2295 2300 2305 2310 

ATA GCC AAC CAA GCC ACA GTG TTA ATG GGT CTC GGG AAA GGA TGG CCA 7074 
Ile Ala Asn Gln Ala Thr Val Leu Met Gly Leu Gly Lys Gly Trp Pro

2315 2320 2325 

TTG TCA AAG ATG GAC ATC GGA GTT CCC CTT CTC GCC ATT GGA TGC TAC 7122 
Leu Ser Lys Met Asp Ile Gly Val Pro Leu Leu Ala Ile Gly Cys Tyr

2330 2335 2340 

TCA CAA GTC AAC CCC ATA ACT CTC ACA GCA GCT CTT TTC TTA TTG GTA 7170 
Ser Gln Val Asn Pro Ile Thr Leu Thr Ala Ala Leu Phe Leu Leu Val
2345 2350 2355 

GCA CAT TAT GCC ATC ATA GGG CCA GGA CTC CAA GCA AAA GCA ACC AGA 7218 
Ala His Tyr Ala Ile Ile Gly Pro Gly Leu Gln Ala Lys Ala Thr Arg
2360 2365 2370 

GAA GCT CAG AAA AGA GCA GCG GCG GGC ATC ATG AAA AAC CCA ACT GTC 7266 

Glu Ala Gln Lys Arg Ala Ala Ala Gly Ile Met Lys Asn Pro Thr Val
2375 2380 2385 2390 

GAT GGA ATA ACA GTG ATT GAC CTA GAT CCA ATA CCT TAT GAT CCA AAG 7314 
Asp Gly Ile Thr Val Ile Asp Leu Asp Pro Ile Pro Tyr Asp Pro Lys

2395 2400 2405 

TTT GAA AAG CAG TTG GGA CAA GTA ATG CTC CTA GTC CTC TGC GTG ACT 7362 
Phe Glu Lys Gln Leu Gly Gln Val Met Leu Leu Val Leu Cys Val Thr

2410 2415 2420 

CAA GTA TTG ATG ATG AGG ACT ACA TGG GCT CTG TGT GAG GCT TTA ACC 7410 
Gln Val Leu Met Met Arg Thr Thr Trp Ala Leu Cys Glu Ala Leu Thr
2425 2430 2435 

TTA GCT ACC GGG CCC ATC TCC ACA TTG TGG GAA GGA AAT CCA GGG AGG 7458 
Leu Ala Thr Gly Pro Ile Ser Thr Leu Trp Glu Gly Asn Pro Gly Arg
2440 2445 2450 

TTT TGG AAC ACT ACC ATT GCG GTG TCA ATG GCT AAC ATT TTT AGA GGG 7506 
Phe Trp Asn Thr Thr Ile Ala Val Ser Met Ala Asn Ile Phe Arg Gly
2455 2460 2465 2470 

AGT TAC TTG GCC GGA GCT GGA CTT CTC TTT TCT ATT ATG AAG AAC ACA 7554 
Ser Tyr Leu Ala Gly Ala Gly Leu Leu Phe Ser Ile Met Lys Asn Thr

2475 2480 2485 

ACC AAC ACA AGA AGG GGA ACT GGC AAC ATA GGA GAG ACG CTT GGA GAG 7602 

Thr Asn Thr Arg Arg Gly Thr Gly Asn Ile Gly Glu Thr Leu Gly Glu

2490 2495 2500 

AAA TGG AAA AGC CGA TTG AAC GCA TTG GGA AAA AGT GAA TTC CAG ATC 7650 
Lys Trp Lys Ser Arg Leu Asn Ala Leu Gly Lys Ser Glu Phe Gln Ile
2505 2510 2515

TAC AAG AAA AGT GGA ATC CAG GAA GTG GAT AGA ACC TTA GCA AAA GAA 7698
Tyr Lys Lys Ser Gly Ile Gln Glu Val Asp Arg Thr Leu Ala Lys Glu
2520 2525 2530 

GGC ATT AAA AGA GGA GAA ACG GAC CAT CAC GCT GTG TCG CGA GGC TCA 7746 
Gly Ile Lys Arg Gly Glu Thr Asp His His Ala Val Ser Arg Gly Ser
2535 2540 2545 2550 

GCA AAA CTG AGA TGG TTC GTT GAG AGA AAC ATG GTC ACA CCA GAA GGG 7794 
Ala Lys Leu Arg Trp Phe Val Glu Arg Asn Met Val Thr Pro Glu Gly 

2555 2560 2565 

AAA GTA GTG GAC CTC GGT TGT GGC AGA GGA GGC TGG TCA TAC TAT TGT 7842 
Lys Val Val Asp Leu Gly Cys Gly Arg Gly Gly Trp Ser Tyr Tyr Cys

2570 2575 2580 

GGA GGA CTA AAG AAT GTA AGA GAA GTC AAA GGC CTA ACA AAA GGA GGA 7890 
Gly Gly Leu Lys Asn Val Arg Glu Val Lys Gly Leu Thr Lys Gly Gly 
2585 2590 2595 

CCA GGA CAC GAA GAA CCC ATC CCC ATG TCA ACA TAT GGG TGG AAT CTA 7938 
Pro Gly His Glu Glu Pro Ile Pro Met Ser Thr Tyr Gly Trp Asn Leu
2600 2605 2610 

GTG CGT CTT CAA AGT GGA GTT GAC GTT TTC TTC ATC CCG CCA GAA AAG 7986 

Val Arg Leu Gln Ser Gly Val Asp Val Phe Phe Ile Pro Pro Glu Lys
2615 2620 2625 2630 

TGT GAC ACA TTA TTG TGT GAC ATA GGG GAG TCA TCA CCA AAT CCC ACA 8034 
Cys Asp Thr Leu Leu Cys Asp Ile Gly Glu Ser Ser Pro Asn Pro Thr

2635 2640 2645 

GTG GAA GCA GGA CGA ACA CTC AGA GTC CTT AAC TTA GTA GAA AAT TGG 8082 
Val Glu Ala Gly Arg Thr Leu Arg Val Leu Asn Leu Val Glu Asn Trp 

2650 2655 2660 

TTG AAC AAC AAC ACT CAA TTT TGC ATA AAG GTT CTC AAC CCA TAT ATG 8130 
Leu Asn Asn Asn Thr Gln Phe Cys Ile Lys Val Leu Asn Pro Tyr Met
2665 2670 2675 

CCC TCA GTC ATA GAA AAA ATG GAA GCA CTA CAA AGG AAA TAT GGA GGA 8178 
Pro Ser Val Ile Glu Lys Met Glu Ala Leu Gln Arg Lys Tyr Gly Gly
2680 2685 2690 

GCC TTA GTG AGG AAT CCA CTC TCA CGA AAC TCC ACA CAT GAG ATG TAC 8226
Ala Leu Val Arg Asn Pro Leu Ser Arg Asn Ser Thr His Glu Met Tyr 
2695 2700 2705 2710 

TGG GTA TCC AAT GCT TCC GGG AAC ATA GTG TCA TCA GTG AAC ATG ATT 8274 

Trp Val Ser Asn Ala Ser Gly Asn Ile Val Ser Ser Val Asn Met Ile

2715 2720 2725 

TCA AGG ATG TTG ATC AAC AGA TTT ACA ATG AGA TAC AAG AAA GCC ACT 8322 
Ser Arg Met Leu Ile Asn Arg Phe Thr Met Arg Tyr Lys Lys Ala Thr

2730 2735 2740 

TAC GAG CCG GAT GTT GAC CTC GGA AGC GGA ACC CGT AAC ATC GGG ATT 8370 
Tyr Glu Pro Asp Val Asp Leu Gly Ser Gly Thr Arg Asn Ile Gly Ile
2745 2750 2755

GAA AGT GAG ATA CCA AAC CTA GAT ATA ATT GGG AAA AGA ATA GAA AAA 8418
Glu Ser Glu Ile Pro Asn Leu Asp Ile Ile Gly Lys Arg Ile Glu Lys
2760 2765 2770 

ATA AAG CAA GAG CAT GAA ACA TCA TGG CAC TAT GAC CAA GAC CAC CCA 8466 

Ile Lys Gln Glu His Glu Thr Ser Trp His Tyr Asp Gln Asp His Pro
2775 2780 2785 2790 

TAC AAA ACG TGG GCA TAC CAT GGT AGC TAT GAA ACA AAA CAG ACT GGA 8514 
Tyr Lys Thr Trp Ala Tyr His Gly Ser Tyr Glu Thr Lys Gln Thr Gly

2795 2800 2805 

TCA GCA TCA TCC ATG GTC AAC GGA GTG GTC AGG CTG CTG ACA AAA CCT 8562 

Ser Ala Ser Ser Met Val Asn Gly Val Val Arg Leu Leu Thr Lys Pro 

2810 2815 2820 

TGG GAC GTC GTC CCC ATG GTG ACA CAG ATG GCA ATG ACA GAC ACG ACT 8610 
Trp Asp Val Val Pro Met Val Thr Gln Met Ala Met Thr Asp Thr Thr
2825 2830 2835 

CCA TTT GGA CAA CAG CGC GTT TTT AAA GAG AAA GTG GAC ACG AGA ACC 8658 
Pro Phe Gly Gln Gln Arg Val Phe Lys Glu Lys Val Asp Thr Arg Thr
2840 2845 2850 

CAA GAA CCG AAA GAA GGC ACG AAG AAA CTA ATG AAA ATA ACA GCA GAG 8706 
Gln Glu Pro Lys Glu Gly Thr Lys Lys Leu Met Lys Ile Thr Ala Glu
2855 2860 2865 2870 

TGG CTT TGG AAA GAA TTA GGG AAG AAA AAG ACA CCC AGG ATG TGC ACC 8754 
Trp Leu Trp Lys Glu Leu Gly Lys Lys Lys Thr Pro Arg Met Cys Thr

2875 2880 2885 

AGA GAA GAA TTC ACA AGA AAG GTG AGA AGC AAT GCA GCC TTG GGG GCC 8802 
Arg Glu Glu Phe Thr Arg Lys Val Arg Ser Asn Ala Ala Leu Gly Ala 

2890 2895 2900 

ATA TTC ACT GAT GAG AAC AAG TGG AAG TCG GCA CGT GAG GCT GTT GAA 8850 
Ile Phe Thr Asp Glu Asn Lys Trp Lys Ser Ala Arg Glu Ala Val Glu
2905 2910 2915 

GAT AGT AGG TTT TGG GAG CTG GTT GAC AAG GAA AGG AAT CTC CAT CTT 8898 
Asp Ser Arg Phe Trp Glu Leu Val Asp Lys Glu Arg Asn Leu His Leu 
2920 2925 2930 

GAA GGA AAG TGT GAA ACA TGT GTG TAC AAC ATG ATG GGA AAA AGA GAG 8946 
Glu Gly Lys Cys Glu Thr Cys Val Tyr Asn Met Met Gly Lys Arg Glu 
2935 2940 2945 2950 

AAG AAG CTA GGG GAA TTC GGC AAG GCA AAA GGC AGC AGA GCC ATA TGG 8994 
Lys Lys Leu Gly Glu Phe Gly Lys Ala Lys Gly Ser Arg Ala Ile Trp

2955 2960 2965 

TAC ATG TGG CTT GGA GCA CGC TTC TTA GAG TTT GAA GCC CTA GGA TTC 9042 
Tyr Met Trp Leu Gly Ala Arg Phe Leu Glu Phe Glu Ala Leu Gly Phe

2970 2975 2980 

TTA AAT GAA GAT CAC TGG TTC TCC AGA GAG AAC TCC CTG AGT GGA GTG 9090 
Leu Asn Glu Asp His Trp Phe Ser Arg Glu Asn Ser Leu Ser Gly Val 
2985 2990 2995

GAA GGA GAA GGG CTG CAC AAG CTA GGT TAC ATT CTA AGA GAC GTG AGC 9138
Glu Gly Glu Gly Leu His Lys Leu Gly Tyr Ile Leu Arg Asp Val Ser
3000 3005 3010 

AAG AAA GAG GGA GGA GCA ATG TAT GCC GAT GAC ACC GCA GGA TGG GAT 9186 
Lys Lys Glu Gly Gly Ala Met Tyr Ala Asp Asp Thr Ala Gly Trp Asp 
3015 3020 3025 3030 

ACA AGA ATC ACA CTA GAA GAC KKA AAA AAT GAA GAA ATG GTA ACA AAC 9234 
Thr Arg Ile Thr Leu Glu Asp Xaa Lys Asn Glu Glu Met Val Thr Asn

3035 3040 3045 

CAC ATG GAA GGA GAA CAC AAG AAA CTA GCC GAG GCC ATT TTC AAA CTA 9282 
His Met Glu Gly Glu His Lys Lys Leu Ala Glu Ala Ile Phe Lys Leu

3050 3055 3060 

ACG TAC CAA AAC AAG GTG GTG CGT GTG CAA AGA CCA ACA CCA AGA GGC 9330 
Thr Tyr Gln Asn Lys Val Val Arg Val Gln Arg Pro Thr Pro Arg Gly
3065 3070 3075 

ACA GTA ATG GAC ATC ATA TCG AGA AGA GAC CAA AGA GGT AGT GGA CAA 9378 
Thr Val Met Asp Ile Ile Ser Arg Arg Asp Gln Arg Gly Ser Gly Gln
3080 3085 3090 

GTT GGC ACC TAT GGA CTC AAT ACT TTC ACC AAT ATG GAA GCC CAA CTA 9426 
Val Gly Thr Tyr Gly Leu Asn Thr Phe Thr Asn Met Glu Ala Gln Leu
3095 3100 3105 3110 

ATC AGA CAG ATG GAG GGA GAA GGA GTC TTT AAA AGC ATT CAG CAC CTA 9474 

Ile Arg Gln Met Glu Gly Glu Gly Val Phe Lys Ser Ile Gln His Leu

3115 3120 3125 

ACA ATC ACA GAA GAA ATC GCT GTG CAA AAC TGG TTA GCA AGA GTG GGG 9522 
Thr Ile Thr Glu Glu Ile Ala Val Gln Asn Trp Leu Ala Arg Val Gly

3130 3135 3140 

CGC GAA AGG TTA TCA AGA ATG GCC ATC AGT GGA GAT GAT TGT GTT GTG 9570 
Arg Glu Arg Leu Ser Arg Met Ala Ile Ser Gly Asp Asp Cys Val Val
3145 3150 3155 

AAA CCT TTA GAT GAC AGG TTC GCA AGC GCT TTA ACA GCT CTA AAT GAC 9618 

Lys Pro Leu Asp Asp Arg Phe Ala Ser Ala Leu Thr Ala Leu Asn Asp 
3160 3165 3170 

ATG GGA AAG ATT AGG AAA GAC ATA CAA CAA TGG GAA CCT TCA AGA GGA 9666 

Met Gly Lys Ile Arg Lys Asp Ile Gln Gln Trp Glu Pro Ser Arg Gly
3175 3180 3185 3190 

TGG AAT GAT TGG ACA CAA GTG CCC TTC TGT TCA CAC CAT TTC CAT GAG 9714 
Trp Asn Asp Trp Thr Gln Val Pro Phe Cys Ser His His Phe His Glu

3195 3200 3205 

TTA ATC ATG AAA GAC GGT CGC GTA CTC GTT GTT CCA TGT AGA AAC CAA 9762 
Leu Ile Met Lys Asp Gly Arg Val Leu Val Val Pro Cys Arg Asn Gln

3210 3215 3220 

GAT GAA CTG ATT GGC AGA GCC CGA ATC TCC CAA GGA GCA GGG TGG TCT 9810 
Asp Glu Leu Ile Gly Arg Ala Arg Ile Ser Gln Gly Ala Gly Trp Ser
3225 3230 3235

TTG CGG GAG ACG GCC TGT TTG GGG AAG TCT TAC GCC CAA ATG TGG AGC 9858
Leu Arg Glu Thr Ala Cys Leu Gly Lys Ser Tyr Ala Gln Met Trp Ser
3240 3245 3250 

TTG ATG TAC TTC CAC AGA CGC GAC CTC AGG CTG GCG GCA AAT GCT ATT 9906 
Leu Met Tyr Phe His Arg Arg Asp Leu Arg Leu Ala Ala Asn Ala Ile
3255 3260 3265 3270 

TGC TCG GCA GTA CCA TCA CAT TGG GTT CCA ACA AGT CGA ACA ACC TGG 9954 
Cys Ser Ala Val Pro Ser His Trp Val Pro Thr Ser Arg Thr Thr Trp 

3275 3280 3285 

TCC ATA CAT GCT AAA CAT GAA TGG ATG ACA ACG GAA GAC ATG CTG ACA 10002 
Ser Ile His Ala Lys His Glu Trp Met Thr Thr Glu Asp Met Leu Thr

3290 3295 3300 

GTC TGG AAC AGG GTG TGG ATT CAA GAA AAC CCA TGG ATG GAA GAC AAA 10050 
Val Trp Asn Arg Val Trp Ile Gln Glu Asn Pro Trp Met Glu Asp Lys
3305 3310 3315 

ACT CCA GTG GAA TCA TGG GAG GAA ATC CCA TAC TTG GGG AAA AGA GAA 10098 
Thr Pro Val Glu Ser Trp Glu Glu Ile Pro Tyr Leu Gly Lys Arg Glu
3320 3325 3330 

GAC CAA TGG TGC GGC TCA TTG ATT GGG TTA ACA AGC AGG GCC ACC TGG 10146 
Asp Gln Trp Cys Gly Ser Leu Ile Gly Leu Thr Ser Arg Ala Thr Trp
3335 3340 3345 3350 

GCA AAG AAC ATC CAA GCA GCA ATA AAT CAA GTT AGA TCC CTT ATA GGC 10194 
Ala Lys Asn Ile Gln Ala Ala Ile Asn Gln Val Arg Ser Leu Ile Gly

3355 3360 3365 

AAT GAA GAA TAC ACA GAT TAC ATG CCA TCC ATG AAA AGA TTC AGA AGA 10242 
Asn Glu Glu Tyr Thr Asp Tyr Met Pro Ser Met Lys Arg Phe Arg Arg 

3370 3375 3380 

GAA GAG GAA GAA GCA GGA GTT CTG TGG TAGAAAGCAA AACTAACATG AAACAAGG 10297 

Glu Glu Glu Glu Ala Gly Val Leu Trp 

3385 3390 

CTAGAAGTCA GGTCGGATTA AGCCATAGTA CGGAAAAAAC TATGCTACCT GTGAGCCCCG 10357 

TCCAAGGACG TTAAAAGAAG TCAGGCCATC ATAAATGCCA TAGCTTGAGT AAACTATGCA 10417 

GCCTGTAGCT CCACCTGAGA AGGTGTAAAA AATCCGGGAG GCCACAAACC ATGGAAGCTG 10477 

TACGCATGGC GTAGTGGACT AGCGGTTAGA GAGGACCCCT CCCTTACAAA TCGCAGCAAC 10537 

AATGGGGGCC CAAGGCGAGA TGAAGCTGTA GTCTCGCTGG AAGGACTAGA GGTTAGAGGA 10597 

GACCCCCCCG AAACAAAAAA CAGCATATTG ACGCTGGGAA AGACCAGAGA TCCTGCTGTC 10657 

TCCTCAGCAT CATTCCAGGC ACAGAACGCC AGAAAATGGA ATGGTGCTGT TGAATCAACA 10717 

GGTTCT 10723 

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10723 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(ix) FEATURE:

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 97...10269
(D) OTHER INFORMATION: 
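
As a consistency check on the feature entry above, the following Python sketch (not part of the original listing) computes the number of codons implied by the coding-sequence location 97...10269 and translates a short run of codons into three-letter amino acid symbols using the standard genetic code. The names `cds_codon_count` and `translate` are illustrative assumptions, and any codon containing an ambiguous base such as N is reported as Xaa, mirroring the convention used in the listing.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}
# Map each of the 64 unambiguous codons to a three-letter symbol.
CODON_TABLE = {
    "".join(codon): THREE_LETTER[aa]
    for codon, aa in zip(product(BASES, repeat=3), ONE_LETTER)
}


def cds_codon_count(start: int, end: int) -> int:
    """Number of codons in a coding region given 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    return length // 3


def translate(codons: str) -> list[str]:
    """Translate whitespace-separated codons; ambiguous codons become 'Xaa'."""
    return [CODON_TABLE.get(codon.upper(), "Xaa") for codon in codons.split()]


if __name__ == "__main__":
    print(cds_codon_count(97, 10269))                        # 3391
    print(" ".join(translate("ATG AAT AAC CAA CGG AAA")))    # Met Asn Asn Gln Arg Lys
```

Under these assumptions, the location 97...10269 spans 10173 nucleotides, that is 3391 codons, and the first six codons of the sequence description translate to Met Asn Asn Gln Arg Lys.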



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

AGTTGTTAGT CTACGTGGAC CGACAAAGAC AGATTCTTTG AGGGAGCTAA GCTCAATGTA 60 
GTTCTAACAG TTTTTTAATT AGAGAGCAGA TCTCTG ATG AAT AAC CAA CGG AAA 114 

Met Asn Asn Gln Arg Lys
1 5 

AAG GCG AAA AAC ACG CCT TTC AAT ATG CTG AAA CGC GAG AGA AAC CGC 162 
Lys Ala Lys Asn Thr Pro Phe Asn Met Leu Lys Arg Glu Arg Asn Arg 

10 15 20 

GTG TCG ACT GTG CAA CAG CTG ACA AAG AGA TTC TCA CTT GGA ATG CTG 210 
Val Ser Thr Val Gln Gln Leu Thr Lys Arg Phe Ser Leu Gly Met Leu
25 30 35 

CAG GGA CGA GGA CCA TTA AAA CTG TTC ATG GCC CTG GTG GCG TTC CTT 258 
Gln Gly Arg Gly Pro Leu Lys Leu Phe Met Ala Leu Val Ala Phe Leu
40 45 50 

CGT TTC CTA ACA ATC CCA CCA ACA GCA GGG ATA TTG AAG AGA TGG GGA 306 
Arg Phe Leu Thr Ile Pro Pro Thr Ala Gly Ile Leu Lys Arg Trp Gly
55 60 65 70 

ACA ATT AAA AAA TCA AAA GCT ATT AAT GTT TTG AGA GGG TTC AGG AAA 354 

Thr Ile Lys Lys Ser Lys Ala Ile Asn Val Leu Arg Gly Phe Arg Lys

75 80 85 

GAG ATT GGA AGG ATG CTG AAC ATC TTG AAT AGG AGA CGC AGA TCT GCA 402 
Glu Ile Gly Arg Met Leu Asn Ile Leu Asn Arg Arg Arg Arg Ser Ala

90 95 100 

GGC ATG ATC ATT ATG CTG ATT CCA ACA GTG ATG GCG TTC CAT TTA ACC 450 
Gly Met Ile Ile Met Leu Ile Pro Thr Val Met Ala Phe His Leu Thr
105 110 115 

ACA CGT AAC GGA GAA CCA CAC ATG ATC GTC AGC AGA CAA GAG AAA GGG 498 
Thr Arg Asn Gly Glu Pro His Met Ile Val Ser Arg Gln Glu Lys Gly
120 125 130 

AAA AGT CTT CTG TTT AAA ACA GAG GTT GGC GTG AAC ATG TGT ACC CTC 546 
Lys Ser Leu Leu Phe Lys Thr Glu Val Gly Val Asn Met Cys Thr Leu 
135 140 145 150 

ATG GCC ATG GAC CTT GGT GAA TTG TGT GAA GAC ACA ATC ACG TAC AAG 594 
Met Ala Met Asp Leu Gly Glu Leu Cys Glu Asp Thr Ile Thr Tyr Lys

155 160 165 

TGT CCC CTT CTC AGG CAG AAT GAG CCA GAA GAC ATA GAC TGT TGG TGC 642 
Cys Pro Leu Leu Arg Gln Asn Glu Pro Glu Asp Ile Asp Cys Trp Cys

170 175 180 

NAC TCT ACG TCC ACG TGG GTA ACT TAT GGG ACG TGT ACC ACC ATG GGA 690 
Xaa Ser Thr Ser Thr Trp Val Thr Tyr Gly Thr Cys Thr Thr Met Gly 
185 190 195

GAA CAT AGA AGA GAA AAA AGA TCA GTG GCA CTC GTT CCA CAT GTG GGA 738
Glu His Arg Arg Glu Lys Arg Ser Val Ala Leu Val Pro His Val Gly 
200 205 210 

ATG GGA CTG GAG ACA CGA ACT GAA ACA TGG ATG TCA TCA GAA GGG GCC 786 

Met Gly Leu Glu Thr Arg Thr Glu Thr Trp Met Ser Ser Glu Gly Ala 
215 220 225 230 



TGG AAA CAT GTC CAG AGA ATT GAA ACT TGG ATC TTG AGA CAT CCA GGC 834
Trp Lys His Val Gln Arg Ile Glu Thr Trp Ile Leu Arg His Pro Gly
235 240 245

TTC ACC ATG ATG GCA GCA ATC CTG GCA TAC ACC ATA GGA ACG ACA CAT 882
Phe Thr Met Met Ala Ala Ile Leu Ala Tyr Thr Ile Gly Thr Thr His
250 255 260

TTC CAA AGA GCC CTG ATT TTC ATC TTA CTG ACA GCT GTC ACT CCT TCA 930
Phe Gln Arg Ala Leu Ile Phe Ile Leu Leu Thr Ala Val Thr Pro Ser
265 270 275

ATG ACA ATG CGT TGC ATA GGA ATG TCA AAT AGA GAC TTT GTG GAA GGG 978
Met Thr Met Arg Cys Ile Gly Met Ser Asn Arg Asp Phe Val Glu Gly
280 285 290

GTT TCA GGA GGA AGC TGG GTT GAC ATA GTC TTA GAA CAT GGA AGC TGT 1026
Val Ser Gly Gly Ser Trp Val Asp Ile Val Leu Glu His Gly Ser Cys
295 300 305 310

GTG ACG ACG ATG GCA AAA AAC AAA CCA ACA TTG GAT TTT GAA CTG ATA 1074
Val Thr Thr Met Ala Lys Asn Lys Pro Thr Leu Asp Phe Glu Leu Ile
315 320 325

AAA ACA GAA GCC AAA CAG CCT GCC ACC CTA AGG AAG TAC TGT ATA GAG 1122
Lys Thr Glu Ala Lys Gln Pro Ala Thr Leu Arg Lys Tyr Cys Ile Glu
330 335 340

GCA AAG CTA ACC NAC ACA ACA ACA GAA TCT CGC TGC CCA ACA CAA GGG 1170
Ala Lys Leu Thr Xaa Thr Thr Thr Glu Ser Arg Cys Pro Thr Gln Gly
345 350 355

GAA CCC AGC CTA AAT GAA GAG CAG GAC AAA AGG TTC GTC TGC AAA CAC 1218
Glu Pro Ser Leu Asn Glu Glu Gln Asp Lys Arg Phe Val Cys Lys His
360 365 370

TCC ATG GTA GAC AGA GGA TGG GGA AAT GGA TGT GGA CTA TTT GGA AAG 1266
Ser Met Val Asp Arg Gly Trp Gly Asn Gly Cys Gly Leu Phe Gly Lys
375 380 385 390

GGA GGC ATT GTG ACC TGT GCT ATG TTC AGA TGC AAA AAG AAC ATG GAA 1314
Gly Gly Ile Val Thr Cys Ala Met Phe Arg Cys Lys Lys Asn Met Glu
395 400 405

GGA AAA GTT GTG CAA CCA GAA AAC TTG GAA TAC ACC ATT GTG ATA ACA 1362
Gly Lys Val Val Gln Pro Glu Asn Leu Glu Tyr Thr Ile Val Ile Thr
410 415 420

CCT CAC TCA GGG GAA GAG CAT GCA GTC GGA NAT GAC ACA GGA AAA CAT 1410
Pro His Ser Gly Glu Glu His Ala Val Gly Xaa Asp Thr Gly Lys His
425 430 435

GGC AAG GAA ATC AAA ATA ACA CCA CAG AGT TCC ATC ACA GAA GCA GAA 1458
Gly Lys Glu Ile Lys Ile Thr Pro Gln Ser Ser Ile Thr Glu Ala Glu
440 445 450 

TTG ACA GGT TAT GGC ACT GTC ACA ATG GAG TGC TCT CCA AGA ACG GGC 1506 
Leu Thr Gly Tyr Gly Thr Val Thr Met Glu Cys Ser Pro Arg Thr Gly 
455 460 465 470 

CTC GAC TTC AAT GAG ATG GTG TTG CTG CAG ATG GAA AAT AAA GCT TGG 1554 
Leu Asp Phe Asn Glu Met Val Leu Leu Gln Met Glu Asn Lys Ala Trp

475 480 485 

CTG GTG CAC AGG CAA TGG TTC CTA GAC CTG CCG TTA CCA TGG TTG CCC 1602 
Leu Val His Arg Gln Trp Phe Leu Asp Leu Pro Leu Pro Trp Leu Pro

490 495 500 

GGA GCG GAC ACA CAA GGG TCA AAT TGG ATA CAG AAA GAG ACA TTG GTC 1650 
Gly Ala Asp Thr Gln Gly Ser Asn Trp Ile Gln Lys Glu Thr Leu Val
505 510 515 

ACT TTC AAA AAT CCC CAT GCG AAG AAA CAG GAT GTT GTT GTT TTA GGA 1698 
Thr Phe Lys Asn Pro His Ala Lys Lys Gln Asp Val Val Val Leu Gly
520 525 530 

TCC CAA GAA GGG GCC ATG CAC ACA GCA CTT ACA GGG GCC ACA GAA ATC 1746 
Ser Gln Glu Gly Ala Met His Thr Ala Leu Thr Gly Ala Thr Glu Ile
535 540 545 550 

CAA ATG TCA TCA GGA AAC TTA CTC TTC ACA GGA CAT CTC AAG TGC AGG 1794 
Gln Met Ser Ser Gly Asn Leu Leu Phe Thr Gly His Leu Lys Cys Arg

555 560 565 

CTG AGA ATG GAC AAG CTA CAG CTC AAA GGA ATG TCA TAC TCT ATG TGC 1842 
Leu Arg Met Asp Lys Leu Gln Leu Lys Gly Met Ser Tyr Ser Met Cys

570 575 580

ACA GGA AAG TTT AAA GTT GTG AAG GAA ATA GCA GAA ACA CAA CAT GGA 1890 
Thr Gly Lys Phe Lys Val Val Lys Glu Ile Ala Glu Thr Gln His Gly
585 590 595 

ACA ATA GTT ATC AGA GTG CAA TAT GAA GGG GAC GGC TCT CCA TGC AAG 1938 

Thr Ile Val Ile Arg Val Gln Tyr Glu Gly Asp Gly Ser Pro Cys Lys
600 605 610 

ATC CCT TTT GAG ATA ATG GAT TTG GAA AAA AGA CAT GTC TTA GGT CGC 1986 
Ile Pro Phe Glu Ile Met Asp Leu Glu Lys Arg His Val Leu Gly Arg
615 620 625 630 

CTG ATT ACA GTC AAC CCA ATT GTG ACA GAA AAA GAT AGC CCA GTC AAC 2034 
Leu Ile Thr Val Asn Pro Ile Val Thr Glu Lys Asp Ser Pro Val Asn

635 640 645 

ATA GAA GCA GAA CCT CCA TTT GGA GAC AGC TAC ATC ATC ATA GGA GTA 2082 
Ile Glu Ala Glu Pro Pro Phe Gly Asp Ser Tyr Ile Ile Ile Gly Val

650 655 660 

GAG CCG GGA CAA CTG AAG CTC AAC TGG TTT AAG AAA GGA AGT TCT ATC 2130 

Glu Pro Gly Gln Leu Lys Leu Asn Trp Phe Lys Lys Gly Ser Ser Ile
665 670 675

GGC CAA ATG TTT GAG ACA ACA ATG AGG GGG GCG AAG AGA ATG GCC ATT 2178
Gly Gln Met Phe Glu Thr Thr Met Arg Gly Ala Lys Arg Met Ala Ile
680 685 690 

TTA GGT GAC ACA GCC TGG GAT TTT GGA TCC TTG GGA GGA GTG TTT ACA 2226 
Leu Gly Asp Thr Ala Trp Asp Phe Gly Ser Leu Gly Gly Val Phe Thr 
695 700 705 710 

TCT ATA GGA AAG GCT CTC CAC CAA GTC TTT GGA GCA ATC TAT GGA GCT 2274 

Ser Ile Gly Lys Ala Leu His Gln Val Phe Gly Ala Ile Tyr Gly Ala

715 720 725 

GCC TTC AGT GGG GTT TCA TGG ACT ATG AAA ATC CTC ATA GGA GTC ATT 2322 
Ala Phe Ser Gly Val Ser Trp Thr Met Lys Ile Leu Ile Gly Val Ile

730 735 740 

ATC ACA TGG ATA GGA ATG AAT TCA CGC AGC ACC TCA CTG TCT GTG ACA 2370 
Ile Thr Trp Ile Gly Met Asn Ser Arg Ser Thr Ser Leu Ser Val Thr
745 750 755 

CTA GTA TTG GTG GGA ATT GTG ACA CTG TAT TTG GGA GTC ATG GTG CAG 2418 
Leu Val Leu Val Gly Ile Val Thr Leu Tyr Leu Gly Val Met Val Gln
760 765 770 

GCC GAT AGT GGT TGC GTT GTG AGC TGG AAA AAC AAA GAA CTG AAA TGT 2466 

Ala Asp Ser Gly Cys Val Val Ser Trp Lys Asn Lys Glu Leu Lys Cys 
775 780 785 790 

GGC AGT GGG ATT TTC ATC ACA GAC AAC GTG CAC ACA TGG ACA GAA CAA 2514 
Gly Ser Gly Ile Phe Ile Thr Asp Asn Val His Thr Trp Thr Glu Gln

795 800 805 

TAC AAG TTC CAA CCA GAA TCC CCT TCA AAA CTA GCT TCA GCT ATC CAG 2562 
Tyr Lys Phe Gln Pro Glu Ser Pro Ser Lys Leu Ala Ser Ala Ile Gln

810 815 820 

AAA GCC CAT GAA GAG GAC ATT TGT GGA ATC CGC TCA GTA ACA AGA CTG 2610 

Lys Ala His Glu Glu Asp Ile Cys Gly Ile Arg Ser Val Thr Arg Leu
825 830 835 

GAG AAT CTG ATG TGG AAA CAA ATA ACA CCA GAA TTG AAT CAC ATT CTA 2658 
Glu Asn Leu Met Trp Lys Gln Ile Thr Pro Glu Leu Asn His Ile Leu
840 845 850

TCA GAA AAT GAG GTG AAG TTA ACT ATT ATG ACA GGA GAC ATC AAA GGA 2706 
Ser Glu Asn Glu Val Lys Leu Thr Ile Met Thr Gly Asp Ile Lys Gly
855 860 865 870 

ATC ATG CAG GCA GGA AAA CGA TCT CTG CGG CCT CAG CCC ACT GAG CTG 2754 
Ile Met Gln Ala Gly Lys Arg Ser Leu Arg Pro Gln Pro Thr Glu Leu

875 880 885 

AAG TAT TCA TGG AAA ACA TGG GGC AAA GCA AAA ATG CTC TCT ACA GAG 2802 
Lys Tyr Ser Trp Lys Thr Trp Gly Lys Ala Lys Met Leu Ser Thr Glu
890 895 900

TCT CAT NAC CAG ACC TTT CTC ATT GAT GGC CCC GAA ACA GCA GAA TGC 2850 

Ser His Xaa Gln Thr Phe Leu Ile Asp Gly Pro Glu Thr Ala Glu Cys
905 910 915

CCC AAC ACA AAT AGA GCT TGG AAT TCG TTG GAA GTT GAA GAC TAT GGC 2898
Pro Asn Thr Asn Arg Ala Trp Asn Ser Leu Glu Val Glu Asp Tyr Gly 
920 925 930 

TTT GGA GTA TTC ACC ACC AAT ATA TGG CTA AAA TTG AAA GAA AAA CAG 2946 
Phe Gly Val Phe Thr Thr Asn Ile Trp Leu Lys Leu Lys Glu Lys Gln
935 940 945 950 

GAT GTA TTC TGC GAC TCA AAA CTC ATG TCA GCG GCC ATA AAA GAC AAC 2994 

Asp Val Phe Cys Asp Ser Lys Leu Met Ser Ala Ala Ile Lys Asp Asn

955 960 965 

AGA GCC GTC CAT GCC GAT ATG GGT TAT TGG ATA GAA AGT GCA CTC NAT 3042 
Arg Ala Val His Ala Asp Met Gly Tyr Trp Ile Glu Ser Ala Leu Xaa

970 975 980 

GAC ACA TGG AAG ATA GAG AAA GCC TCT TTC ATT GAA GTT AAA AAC TGC 3090 
Asp Thr Trp Lys Ile Glu Lys Ala Ser Phe Ile Glu Val Lys Asn Cys
985 990 995 

CAC TGG CCA AAA TCA CAC ACC CTC TGG AGC AAT GGA GTG CTA GAA AGT 3138 
His Trp Pro Lys Ser His Thr Leu Trp Ser Asn Gly Val Leu Glu Ser 
1000 1005 1010 

GAG ATG ATA ATT CCA AAG AAT CTC GCT GGA CCA GTG TCT CAA CAC AAC 3186 

Glu Met Ile Ile Pro Lys Asn Leu Ala Gly Pro Val Ser Gln His Asn
1015 1020 1025 1030 

TAT AGA CCA GGC TAC CAT ACA CAA ATA ACA GGA CCA TGG CAT CTA GGT 3234 
Tyr Arg Pro Gly Tyr His Thr Gln Ile Thr Gly Pro Trp His Leu Gly

1035 1040 1045 

AAG CTT GAG ATG GAC TTT GAT TTC TGT GAT GGA ACA ACA GTG GTA GTG 3282 
Lys Leu Glu Met Asp Phe Asp Phe Cys Asp Gly Thr Thr Val Val Val 

1050 1055 1060 

ACT GAG GAC TGC GGA AAT AGA GGA CCC TCT TTG AGA ACA ACC ACT GCC 3330 
Thr Glu Asp Cys Gly Asn Arg Gly Pro Ser Leu Arg Thr Thr Thr Ala 
1065 1070 1075 

TCT GGA AAA CTC ATA ACA GAA TGG TGC TGC CGA TCT TGC ACA TTA CCA 3378 
Ser Gly Lys Leu Ile Thr Glu Trp Cys Cys Arg Ser Cys Thr Leu Pro
1080 1085 1090 

CCG CTA AGA TAC AGA GGT GAG GAT GGG TGC TGG TAC GGG ATG GAA ATC 3426 
Pro Leu Arg Tyr Arg Gly Glu Asp Gly Cys Trp Tyr Gly Met Glu Ile
1095 1100 1105 1110 

AGA CCA TTG AAG GAG AAA GAA GAG AAT TTG GTC AAC TCC TTG GTC ACA 3474 
Arg Pro Leu Lys Glu Lys Glu Glu Asn Leu Val Asn Ser Leu Val Thr 

1115 1120 1125 

GCT GGA CAT GGG CAG GTC GAC AAC TTT TCA CTA GGA GTC TTG GGA ATG 3522 
Ala Gly His Gly Gln Val Asp Asn Phe Ser Leu Gly Val Leu Gly Met
1130 1135 1140

GCA TTG TTC CTG GAG GAA ATG CTT AGG ACC CGA GTA GGA ACG AAA CAT 3570 
Ala Leu Phe Leu Glu Glu Met Leu Arg Thr Arg Val Gly Thr Lys His 
1145 1150 1155

GCA ATA CTA CTA GTT GCA GTT TCT TTT GTG ACA TTG ATC ACA GGG AAC 3618
Ala Ile Leu Leu Val Ala Val Ser Phe Val Thr Leu Ile Thr Gly Asn
1160 1165 1170 

ATG TCC TTT AGA GAC CTG GGA AGA GTG ATG GTT ATG GTA GGC GCC ACT 3666 

Met Ser Phe Arg Asp Leu Gly Arg Val Met Val Met Val Gly Ala Thr 
1175 1180 1185 1190 

ATG ACG GAT GAC ATA GGT ATG GGC GTG ACT TAT CTT GCC CTA CTA GCA 3714 
Met Thr Asp Asp Ile Gly Met Gly Val Thr Tyr Leu Ala Leu Leu Ala

1195 1200 1205 

GCC TTC AAA GTC AGA CCA ACT TTT GCA GCT GGA CTA CTC TTG AGA AAG 3762 
Ala Phe Lys Val Arg Pro Thr Phe Ala Ala Gly Leu Leu Leu Arg Lys 

1210 1215 1220 

CTG ACC TCC AAG GAA TTG ATG ATG ACT ACT ATA GGA ATT GTA CTC CTC 3810 

Leu Thr Ser Lys Glu Leu Met Met Thr Thr Ile Gly Ile Val Leu Leu
1225 1230 1235 

TCC CAG AGC ACC ATA CCA GAG ACC ATT CTT GAG TTG ACT GAT GCG TTA 3858 
Ser Gln Ser Thr Ile Pro Glu Thr Ile Leu Glu Leu Thr Asp Ala Leu
1240 1245 1250 

GCC TTA GGC ATG ATG GTC CTC AAA ATG GTG AGA AAT ATG GAA AAG TAT 3906 

Ala Leu Gly Met Met Val Leu Lys Met Val Arg Asn Met Glu Lys Tyr 
1255 1260 1265 1270 

CAA TTG GCA GTG ACT ATC ATG GCT ATC TTG TGC GTC CCA AAC GCA GTG 3954 
Gln Leu Ala Val Thr Ile Met Ala Ile Leu Cys Val Pro Asn Ala Val

1275 1280 1285 

ATA TTA CAA AAC GCA TGG AAA GTG AGT TGC ACA ATA TTG GCA GTG GTG 4002 
Ile Leu Gln Asn Ala Trp Lys Val Ser Cys Thr Ile Leu Ala Val Val

1290 1295 1300 

TCC GTT TCC CCA CTG TTC TTA ACA TCC TCA CAG CAA AAA ACA GAT TGG 4050 
Ser Val Ser Pro Leu Phe Leu Thr Ser Ser Gln Gln Lys Thr Asp Trp
1305 1310 1315 

ATA CCA TTA GCA TTG ACG ATC AAA GGT CTC AAT CCA ACA GCT ATT TTT 4098 
Ile Pro Leu Ala Leu Thr Ile Lys Gly Leu Asn Pro Thr Ala Ile Phe
1320 1325 1330 

CTA ACA ACC CTC TCA AGA ACC AGC AAG AAA AGG AGC TGG CCA TTA AAT 4146 
Leu Thr Thr Leu Ser Arg Thr Ser Lys Lys Arg Ser Trp Pro Leu Asn 
1335 1340 1345 1350 

GAG GCT ATC ATG GCA GTC GGG ATG GTG AGC ATT TTA GCC AGT TCT CTC 4194 
Glu Ala Ile Met Ala Val Gly Met Val Ser Ile Leu Ala Ser Ser Leu

1355 1360 1365 

CTA AAA AAT GAT ATT CCC ATG ACA GGA CCA TTA GTG GCT GGA GGG CTC 4242 
Leu Lys Asn Asp Ile Pro Met Thr Gly Pro Leu Val Ala Gly Gly Leu

1370 1375 1380 

CTC ACT GTG TGC TAC GTG CTC ACT GGA CGA TCG GCC GAT TTG GAA CTG 4290 
Leu Thr Val Cys Tyr Val Leu Thr Gly Arg Ser Ala Asp Leu Glu Leu 
1385 1390 1395

GAG AGA GCA GCC GAT GTC AAA TGG GAA GAC CAG GCA GAG ATA TCA GGA 4338
Glu Arg Ala Ala Asp Val Lys Trp Glu Asp Gln Ala Glu Ile Ser Gly
1400 1405 1410 

AGC AGT CCA ATC CTG TCA ATA ACA ATA TCA GAA GAT GGT AGC ATG TCG 4386 

Ser Ser Pro Ile Leu Ser Ile Thr Ile Ser Glu Asp Gly Ser Met Ser
1415 1420 1425 1430 

ATA AAA AAT GAA GAG GAA GAA CAA ACA CTG ACC ATA CTC ATT AGA ACA 4434 
Ile Lys Asn Glu Glu Glu Glu Gln Thr Leu Thr Ile Leu Ile Arg Thr

1435 1440 1445 

GGA TTG CTG GTG ATC TCA GGA CTT TTT CCT GTA TCA ATA CCA ATC ACG 4482 
Gly Leu Leu Val Ile Ser Gly Leu Phe Pro Val Ser Ile Pro Ile Thr

1450 1455 1460 

GCA GCA GCA TGG TAC CTG TGG GAA GTG AAG AAA CAA CGG GCC GGA GTA 4530 
Ala Ala Ala Trp Tyr Leu Trp Glu Val Lys Lys Gln Arg Ala Gly Val
1465 1470 1475 

TTG TGG GAT GTT CCT TCA CCC CCA CCC ATG GGA AAG GCT GAA CTG GAA 4578 

Leu Trp Asp Val Pro Ser Pro Pro Pro Met Gly Lys Ala Glu Leu Glu 
1480 1485 1490 

GAT GGA GCC TAT AGA ATT AAG CAA AAA GGG ATT CTT GGA TAT TCC CAG 4626 
Asp Gly Ala Tyr Arg Ile Lys Gln Lys Gly Ile Leu Gly Tyr Ser Gln
1495 1500 1505 1510

ATC GGA GCC GGA GTT TAC AAA GAA GGA ACA TTC CAT ACA ATG TGG CAT 4674 

Ile Gly Ala Gly Val Tyr Lys Glu Gly Thr Phe His Thr Met Trp His

1515 1520 1525 

GTC ACA CGT GGC GCT GTT CTA ATG CAT AAA GGA AAG AGG ATT GAA CCA 4722 
Val Thr Arg Gly Ala Val Leu Met His Lys Gly Lys Arg Ile Glu Pro

1530 1535 1540 

TCA TGG GCG GAC GTC AAG AAA GAC CTA ATA TCA TAT GGA GGA GGC TGG 4770 
Ser Trp Ala Asp Val Lys Lys Asp Leu Ile Ser Tyr Gly Gly Gly Trp
1545 1550 1555 

AAG TTA GAA GGA GAA TGG AAG GAA GGA GAA GAA GTC CAG GTA TTG GCA 4818 
Lys Leu Glu Gly Glu Trp Lys Glu Gly Glu Glu Val Gln Val Leu Ala
1560 1565 1570 

CTG GAG CCT GGA AAA AAT CCA AGA GCC GTC CAA ACG AAA CCT GGT CTT 4866 

Leu Glu Pro Gly Lys Asn Pro Arg Ala Val Gln Thr Lys Pro Gly Leu
1575 1580 1585 1590 

TTC AAA ACC AAC GCC GGA ACA ATA GGT GCT GTA TCT CTG GAC TTT TCT 4914 
Phe Lys Thr Asn Ala Gly Thr Ile Gly Ala Val Ser Leu Asp Phe Ser

1595 1600 1605 

CCT GGA ACG TCA GGA TCT CCA ATT ATC GAC AAA AAA GGA AAA GTT GTG 4962 

Pro Gly Thr Ser Gly Ser Pro Ile Ile Asp Lys Lys Gly Lys Val Val

1610 1615 1620 

GGT CTT TAT GGT AAT GGT GTT GTT ACA AGG AGT GGA GCA TAT GTG AGT 5010 
Gly Leu Tyr Gly Asn Gly Val Val Thr Arg Ser Gly Ala Tyr Val Ser 
1625 1630 1635

GCT ATA GCC CAG ACT GAA AAA AGC ATT GAA GAC AAC CCA GAG ATC GAA 5058
Ala Ile Ala Gln Thr Glu Lys Ser Ile Glu Asp Asn Pro Glu Ile Glu
1640 1645 1650 

GAT GAC ATT TTC CGA AAG AGA AGA CTG ACC ATC ATG GAC CTC CAC CCA 5106 
Asp Asp Ile Phe Arg Lys Arg Arg Leu Thr Ile Met Asp Leu His Pro
1655 1660 1665 1670 

GGA GCG GGA AAG ACG AAG AGA TAC CTT CCG GCC ATA GTC AGA GAA GCT 5154 
Gly Ala Gly Lys Thr Lys Arg Tyr Leu Pro Ala Ile Val Arg Glu Ala

1675 1680 1685 

ATA AAA CGG GGT TTG AGA ACA TTA ATC TTG GCC CCC ACT AGA GTT GTG 5202 
Ile Lys Arg Gly Leu Arg Thr Leu Ile Leu Ala Pro Thr Arg Val Val

1690 1695 1700 

GCA GCT GAA ATG GAG GAA GCC CTT AGA GGA CTT CCA ATA AGA TAC CAG 5250 
Ala Ala Glu Met Glu Glu Ala Leu Arg Gly Leu Pro Ile Arg Tyr Gln
1705 1710 1715 

ACC CCA GCC ATC AGA GCT GAG CAC ACC GGG CGG GAG ATT GTG GAC CTA 5298 
Thr Pro Ala Ile Arg Ala Glu His Thr Gly Arg Glu Ile Val Asp Leu
1720 1725 1730 

ATG TGT CAT GCC ACA TTT ACC ATG AGG CTG CTA TCA CCA GTT AGA GTG 5346 
Met Cys His Ala Thr Phe Thr Met Arg Leu Leu Ser Pro Val Arg Val 
1735 1740 1745 1750 

CCA AAC TAC AAC CTG ATT ATC ATG GAC GAA GCC CAT TTC ACA GAC CCA 5394 
Pro Asn Tyr Asn Leu Ile Ile Met Asp Glu Ala His Phe Thr Asp Pro

1755 1760 1765 

GCA AGT ATA GCA GCT AGA GGA TAC ATC TCA ACT CGA GTG GAG ATG GGT 5442 
Ala Ser Ile Ala Ala Arg Gly Tyr Ile Ser Thr Arg Val Glu Met Gly

1770 1775 1780 

GAG GCA GCT GGG ATT TTT ATG ACA GCC ACT CCC CCG GGA AGC AGA GAC 5490 
Glu Ala Ala Gly Ile Phe Met Thr Ala Thr Pro Pro Gly Ser Arg Asp
1785 1790 1795 

CCA TTT CCT CAG AGC AAT GCA CCA ATC ATA GAT GAA GAA AGA GAA ATC 5538 
Pro Phe Pro Gln Ser Asn Ala Pro Ile Ile Asp Glu Glu Arg Glu Ile
1800 1805 1810 

CCT GAA CGT TCG TGG AAT TCC GGA CAT GAA TGG GTC ACG GAT TTT AAA 5586 
Pro Glu Arg Ser Trp Asn Ser Gly His Glu Trp Val Thr Asp Phe Lys
1815 1820 1825 1830 

GGG AAG ACT GTT TGG TTC GTT CCA AGT ATA AAA GCA GGA AAT GAT ATA 5634 
Gly Lys Thr Val Trp Phe Val Pro Ser Ile Lys Ala Gly Asn Asp Ile

1835 1840 1845 

GCA GCT TGC CTG AGG AAA AAT GGA AAG AAA GTG ATA CAA CTC AGT AGG 5682 
Ala Ala Cys Leu Arg Lys Asn Gly Lys Lys Val Ile Gln Leu Ser Arg

1850 1855 1860 

AAG ACC TTT GAT TCT GAG TAT GTC AAG ACT AGA ACC AAT GAT TGG GAC 5730 
Lys Thr Phe Asp Ser Glu Tyr Val Lys Thr Arg Thr Asn Asp Trp Asp 
1865 1870 1875

TTC GTG GTT ACA ACT GAC ATT TCA GAA ATG GGT GCC AAT TTC AAG GCT 5778
Phe Val Val Thr Thr Asp Ile Ser Glu Met Gly Ala Asn Phe Lys Ala
1880 1885 1890 

GAG AGG GTT ATA GAC CCC AGA CGC TGC ATG AAA CCA GTC ATA CTA ACA 5826 
Glu Arg Val Ile Asp Pro Arg Arg Cys Met Lys Pro Val Ile Leu Thr
1895 1900 1905 1910 

GAT GGT GAA GAG CGG GTG ATT CTG GCA GGA CCT ATG CCA GTG ACC CAC 5874 
Asp Gly Glu Glu Arg Val Ile Leu Ala Gly Pro Met Pro Val Thr His

1915 1920 1925 

TCT AGT GCA GCA CAA AGA AGA GGG AGA ATA GGA AGA AAT CCA AAA AAT 5922 
Ser Ser Ala Ala Gln Arg Arg Gly Arg Ile Gly Arg Asn Pro Lys Asn

1930 1935 1940 

GAG AAT GAC CAG TAC ATA TAC ATG GGG GAA CCT CTG GAA AAT GAT GAA 5970 
Glu Asn Asp Gln Tyr Ile Tyr Met Gly Glu Pro Leu Glu Asn Asp Glu
1945 1950 1955 

GAC TGT GCA CAC TGG AAA GAA GCT AAA ATG CTC CTA GAT AAC ATC AAC 6018 
Asp Cys Ala His Trp Lys Glu Ala Lys Met Leu Leu Asp Asn Ile Asn
1960 1965 1970 

ACG CCA GAA GGA ATC ATT CCT AGC ATG TTC GAA CCA GAG CGT GAA AAG 6066 
Thr Pro Glu Gly Ile Ile Pro Ser Met Phe Glu Pro Glu Arg Glu Lys
1975 1980 1985 1990 

GTG GAT GCC ATT GAT GGC GAA TAC CGC TTG AGA GGA GAA GCA AGG AAA 6114 
Val Asp Ala Ile Asp Gly Glu Tyr Arg Leu Arg Gly Glu Ala Arg Lys

1995 2000 2005 

ACC TTT GTA GAC TTA ATG AGA AGA GGA GAC CTA CCA GTC TGG TTG GCC 6162 
Thr Phe Val Asp Leu Met Arg Arg Gly Asp Leu Pro Val Trp Leu Ala 

2010 2015 2020 

TAC AGA GTG GCA GCT GAA GGC ATC AAC TAC GCA GAC AGA AGG TGG TGT 6210 
Tyr Arg Val Ala Ala Glu Gly Ile Asn Tyr Ala Asp Arg Arg Trp Cys
2025 2030 2035 

TTT GAT GGA GTC AAG AAC AAC CAA ATC CTA GAA GAA AAC GTG GAA GTT 6258 
Phe Asp Gly Val Lys Asn Asn Gln Ile Leu Glu Glu Asn Val Glu Val
2040 2045 2050 

GAA ATC TGG ACA AAA GAA GGG GAA AGG AAG AAA TTG AAA CCC AGA TGG 6306 
Glu Ile Trp Thr Lys Glu Gly Glu Arg Lys Lys Leu Lys Pro Arg Trp
2055 2060 2065 2070 

TTG GAT GCT AGG ATC TAT TCT GAC CCA CTG GCG CTA AAA GAA TTT AAG 6354 
Leu Asp Ala Arg Ile Tyr Ser Asp Pro Leu Ala Leu Lys Glu Phe Lys

2075 2080 2085 

GAA TTT GCA GCC GGA AGA AAG TCT CTG ACC CTG AAC CTA ATC ACA GAA 6402 
Glu Phe Ala Ala Gly Arg Lys Ser Leu Thr Leu Asn Leu Ile Thr Glu

2090 2095 2100 

ATG GGT AGG CTC CCA ACC TTC ATG ACT CAG AAG GCA AGA GAC GCA CTG 6450 

Met Gly Arg Leu Pro Thr Phe Met Thr Gln Lys Ala Arg Asp Ala Leu
2105 2110 2115

GAC AAC TTA GCA GTG CTG CAC ACG GCT GAG GCA GGT GGA AGG GCG TAC 6498
Asp Asn Leu Ala Val Leu His Thr Ala Glu Ala Gly Gly Arg Ala Tyr 
2120 2125 2130 

AAC CAT GCT CTC AGT GAA CTG CCG GAG ACC CTG GAG ACA TTG CTT TTA 6546 
Asn His Ala Leu Ser Glu Leu Pro Glu Thr Leu Glu Thr Leu Leu Leu 
2135 2140 2145 2150 

CTG ACA CTT CTG GCT ACA GTC ACG GGA GGG ATC TTT TTA TTC TTG ATG 6594 
Leu Thr Leu Leu Ala Thr Val Thr Gly Gly Ile Phe Leu Phe Leu Met

2155 2160 2165 

AGC GCA AGG GGC ATA GGG AAG ATG ACC CTG GGA ATG TGC TGC ATA ATC 6642 
Ser Ala Arg Gly Ile Gly Lys Met Thr Leu Gly Met Cys Cys Ile Ile

2170 2175 2180 

ACG GCT AGC ATC CTC CTA TGG TAC GCA CAA ATA CAG CCA CAC TGG ATA 6690 
Thr Ala Ser Ile Leu Leu Trp Tyr Ala Gln Ile Gln Pro His Trp Ile
2185 2190 2195 

GCA GCT TCA ATA ATA CTG GAG TTT TTT CTC ATA GTT TTG CTT ATT CCA 6738 
Ala Ala Ser Ile Ile Leu Glu Phe Phe Leu Ile Val Leu Leu Ile Pro
2200 2205 2210 

GAA CCT GAA AAA CAG AGA ACA CCC CAA GAC AAC CAA CTG ACC TAC GTT 6786 
Glu Pro Glu Lys Gln Arg Thr Pro Gln Asp Asn Gln Leu Thr Tyr Val
2215 2220 2225 2230 

GTC ATA GCC ATC CTC ACA GTG GTG GCC GCA ACC ATG GCA AAC GAG ATG 6834 
Val Ile Ala Ile Leu Thr Val Val Ala Ala Thr Met Ala Asn Glu Met

2235 2240 2245 

GGT TTC CTA GAA AAA ACG AAG AAA GAT CTC GGA TTG GGA AGC ATT GCA 6882 
Gly Phe Leu Glu Lys Thr Lys Lys Asp Leu Gly Leu Gly Ser Ile Ala

2250 2255 2260 

ACC CAG CAA CCC GAG AGC AAC ATC CTG GAC ATA GAT CTA CGT CCT GCA 6930 
Thr Gln Gln Pro Glu Ser Asn Ile Leu Asp Ile Asp Leu Arg Pro Ala
2265 2270 2275 

TCA GCA TGG ACG CTG TAT GCC GTG GCC ACA ACA TTT GTT ACA CCA ATG 6978 
Ser Ala Trp Thr Leu Tyr Ala Val Ala Thr Thr Phe Val Thr Pro Met 
2280 2285 2290 

TTG AGA CAT AGC ATT GAA AAT TCC TCA GTG AAT GTG TCC CTA ACA GCT 7026 
Leu Arg His Ser Ile Glu Asn Ser Ser Val Asn Val Ser Leu Thr Ala
2295 2300 2305 2310 

ATA GCC AAC CAA GCC ACA GTG TTA ATG GGT CTC GGG AAA GGA TGG CCA 7074 
Ile Ala Asn Gln Ala Thr Val Leu Met Gly Leu Gly Lys Gly Trp Pro

2315 2320 2325 

TTG TCA AAG ATG GAC ATC GGA GTT CCC CTT CTC GCC ATT GGA TGC TAC 7122 
Leu Ser Lys Met Asp Ile Gly Val Pro Leu Leu Ala Ile Gly Cys Tyr

2330 2335 2340 

TCA CAA GTC AAC CCC ATA ACT CTC ACA GCA GCT CTT TTC TTA TTG GTA 7170 
Ser Gln Val Asn Pro Ile Thr Leu Thr Ala Ala Leu Phe Leu Leu Val
2345 2350 2355

GCA CAT TAT GCC ATC ATA GGG CCA GGA CTC CAA GCA AAA GCA ACC AGA 7218
Ala His Tyr Ala Ile Ile Gly Pro Gly Leu Gln Ala Lys Ala Thr Arg
2360 2365 2370 

GAA GCT CAG AAA AGA GCA GCG GCG GGC ATC ATG AAA AAC CCA ACT GTC 7266 
Glu Ala Gln Lys Arg Ala Ala Ala Gly Ile Met Lys Asn Pro Thr Val
2375 2380 2385 2390 

GAT GGA ATA ACA GTG ATT GAC CTA GAT CCA ATA CCT TAT GAT CCA AAG 7314 

Asp Gly Ile Thr Val Ile Asp Leu Asp Pro Ile Pro Tyr Asp Pro Lys

2395 2400 2405 

TTT GAA AAG CAG TTG GGA CAA GTA ATG CTC CTA GTC CTC TGC GTG ACT 7362 
Phe Glu Lys Gln Leu Gly Gln Val Met Leu Leu Val Leu Cys Val Thr

2410 2415 2420 

CAA GTA TTG ATG ATG AGG ACT ACA TGG GCT CTG TGT GAG GCT TTA ACC 7410 
Gln Val Leu Met Met Arg Thr Thr Trp Ala Leu Cys Glu Ala Leu Thr
2425 2430 2435 

TTA GCT ACC GGG CCC ATC TCC ACA TTG TGG GAA GGA AAT CCA GGG AGG 7458 
Leu Ala Thr Gly Pro Ile Ser Thr Leu Trp Glu Gly Asn Pro Gly Arg
2440 2445 2450 

TTT TGG AAC ACT ACC ATT GCG GTG TCA ATG GCT AAC ATT TTT AGA GGG 7506 

Phe Trp Asn Thr Thr Ile Ala Val Ser Met Ala Asn Ile Phe Arg Gly
2455 2460 2465 2470 

AGT TAC TTG GCC GGA GCT GGA CTT CTC TTT TCT ATT ATG AAG AAC ACA 7554 

Ser Tyr Leu Ala Gly Ala Gly Leu Leu Phe Ser Ile Met Lys Asn Thr

2475 2480 2485 

ACC AAC ACA AGA AGG GGA ACT GGC AAC ATA GGA GAG ACG CTT GGA GAG 7602 
Thr Asn Thr Arg Arg Gly Thr Gly Asn Ile Gly Glu Thr Leu Gly Glu

2490 2495 2500 

AAA TGG AAA AGC CGA TTG AAC GCA TTG GGA AAA AGT GAA TTC CAG ATC 7650 
Lys Trp Lys Ser Arg Leu Asn Ala Leu Gly Lys Ser Glu Phe Gln Ile
2505 2510 2515 

TAC AAG AAA AGT GGA ATC CAG GAA GTG GAT AGA ACC TTA GCA AAA GAA 7698 
Tyr Lys Lys Ser Gly Ile Gln Glu Val Asp Arg Thr Leu Ala Lys Glu
2520 2525 2530 

GGC ATT AAA AGA GGA GAA ACG GAC CAT CAC GCT GTG TCG CGA GGC TCA 7746 

Gly Ile Lys Arg Gly Glu Thr Asp His His Ala Val Ser Arg Gly Ser
2535 2540 2545 2550 

GCA AAA CTG AGA TGG TTC GTT GAG AGA AAC ATG GTC ACA CCA GAA GGG 7794 
Ala Lys Leu Arg Trp Phe Val Glu Arg Asn Met Val Thr Pro Glu Gly 

2555 2560 2565 

AAA GTA GTG GAC CTC GGT TGT GGC AGA GGA GGC TGG TCA TAC TAT TGT 7842 
Lys Val Val Asp Leu Gly Cys Gly Arg Gly Gly Trp Ser Tyr Tyr Cys 

2570 2575 2580 

GGA GGA CTA AAG AAT GTA AGA GAA GTC AAA GGC CTA ACA AAA GGA GGA 7890 
Gly Gly Leu Lys Asn Val Arg Glu Val Lys Gly Leu Thr Lys Gly Gly 
2585 2590 2595

CCA GGA CAC GAA GAA CCC ATC CCC ATG TCA ACA TAT GGG TGG AAT CTA 7938
Pro Gly His Glu Glu Pro Ile Pro Met Ser Thr Tyr Gly Trp Asn Leu
2600 2605 2610 

GTG CGT CTT CAA AGT GGA GTT GAC GTT TTC TTC ATC CCG CCA GAA AAG 7986 

Val Arg Leu Gln Ser Gly Val Asp Val Phe Phe Ile Pro Pro Glu Lys
2615 2620 2625 2630 

TGT GAC ACA TTA TTG TGT GAC ATA GGG GAG TCA TCA CCA AAT CCC ACA 8034 
Cys Asp Thr Leu Leu Cys Asp Ile Gly Glu Ser Ser Pro Asn Pro Thr

2635 2640 2645 

GTG GAA GCA GGA CGA ACA CTC AGA GTC CTT AAC TTA GTA GAA AAT TGG 8082 
Val Glu Ala Gly Arg Thr Leu Arg Val Leu Asn Leu Val Glu Asn Trp 

2650 2655 2660 

TTG AAC AAC AAC ACT CAA TTT TGC ATA AAG GTT CTC AAC CCA TAT ATG 8130 

Leu Asn Asn Asn Thr Gln Phe Cys Ile Lys Val Leu Asn Pro Tyr Met
2665 2670 2675 

CCC TCA GTC ATA GAA AAA ATG GAA GCA CTA CAA AGG AAA TAT GGA GGA 8178 
Pro Ser Val Ile Glu Lys Met Glu Ala Leu Gln Arg Lys Tyr Gly Gly
2680 2685 2690 

GCC TTA GTG AGG AAT CCA CTC TCA CGA AAC TCC ACA CAT GAG ATG TAC 8226 
Ala Leu Val Arg Asn Pro Leu Ser Arg Asn Ser Thr His Glu Met Tyr 
2695 2700 2705 2710 

TGG GTA TCC AAT GCT TCC GGG AAC ATA GTG TCA TCA GTG AAC ATG ATT 8274 
Trp Val Ser Asn Ala Ser Gly Asn Ile Val Ser Ser Val Asn Met Ile

2715 2720 2725 

TCA AGG ATG TTG ATC AAC AGA TTT ACA ATG AGA TAC AAG AAA GCC ACT 8322 
Ser Arg Met Leu Ile Asn Arg Phe Thr Met Arg Tyr Lys Lys Ala Thr

2730 2735 2740 

TAC GAG CCG GAT GTT GAC CTC GGA AGC GGA ACC CGT AAC ATC GGG ATT 8370 
Tyr Glu Pro Asp Val Asp Leu Gly Ser Gly Thr Arg Asn Ile Gly Ile
2745 2750 2755 

GAA AGT GAG ATA CCA AAC CTA GAT ATA ATT GGG AAA AGA ATA GAA AAA 8418 

Glu Ser Glu Ile Pro Asn Leu Asp Ile Ile Gly Lys Arg Ile Glu Lys
2760 2765 2770 

ATA AAG CAA GAG CAT GAA ACA TCA TGG CAC TAT GAC CAA GAC CAC CCA 8466 
Ile Lys Gln Glu His Glu Thr Ser Trp His Tyr Asp Gln Asp His Pro
2775 2780 2785 2790 

TAC AAA ACG TGG GCA TAC CAT GGT AGC TAT GAA ACA AAA CAG ACT GGA 8514 
Tyr Lys Thr Trp Ala Tyr His Gly Ser Tyr Glu Thr Lys Gln Thr Gly

2795 2800 2805 

TCA GCA TCA TCC ATG GTC AAC GGA GTG GTC AGG CTG CTG ACA AAA CCT 8562 

Ser Ala Ser Ser Met Val Asn Gly Val Val Arg Leu Leu Thr Lys Pro 

2810 2815 2820 

TGG GAC GTT GTC CCC ATG GTG ACA CAG ATG GCA ATG ACA GAC ACG ACT 8610 
Trp Asp Val Val Pro Met Val Thr Gln Met Ala Met Thr Asp Thr Thr
2825 2830 2835

CCA TTT GGA CAA CAG CGC GTT TTT AAA GAG AAA GTG GAC ACG AGA ACC 8658
Pro Phe Gly Gln Gln Arg Val Phe Lys Glu Lys Val Asp Thr Arg Thr
2840 2845 2850 

CAA GAA CCG AAA GAA GGC ACG AAG AAA CTA ATG AAA ATA ACA GCA GAG 8706 
Gln Glu Pro Lys Glu Gly Thr Lys Lys Leu Met Lys Ile Thr Ala Glu
2855 2860 2865 2870 

TGG CTT TGG AAA GAA TTA GGG AAG AAA AAG ACA CCC AGG ATG TGC ACC 8754 
Trp Leu Trp Lys Glu Leu Gly Lys Lys Lys Thr Pro Arg Met Cys Thr 

2875 2880 2885 

AGA GAA GAA TTC ACA AGA AAG GTG AGA AGC AAT GCA GCC TTG GGG GCC 8802 

Arg Glu Glu Phe Thr Arg Lys Val Arg Ser Asn Ala Ala Leu Gly Ala 

2890 2895 2900 

ATA TTC ACT GAT GAG AAC AAG TGG AAG TCG GCA CGT GAG GCT GTT GAA 8850 
Ile Phe Thr Asp Glu Asn Lys Trp Lys Ser Ala Arg Glu Ala Val Glu
2905 2910 2915 

GAT AGT AGG TTT TGG GAG CTG GTT GAC AAG GAA AGG AAT CTC CAT CTT 8898 
Asp Ser Arg Phe Trp Glu Leu Val Asp Lys Glu Arg Asn Leu His Leu 
2920 2925 2930 

GAA GGA AAG TGT GAA ACA TGT GTG TAC AAC ATG ATG GGA AAA AGA GAG 8946 
Glu Gly Lys Cys Glu Thr Cys Val Tyr Asn Met Met Gly Lys Arg Glu 
2935 2940 2945 2950 

AAG AAG CTA GGG GAA TTC GGC AAG GCA AAA GGC AGC AGA GCC ATA TGG 8994 
Lys Lys Leu Gly Glu Phe Gly Lys Ala Lys Gly Ser Arg Ala Ile Trp

2955 2960 2965 

TAC ATG TGG CTT GGA GCA CGC TTC TTA GAG TTT GAA GCC CTA GGA TTC 9042 
Tyr Met Trp Leu Gly Ala Arg Phe Leu Glu Phe Glu Ala Leu Gly Phe 

2970 2975 2980 

TTA AAT GAA GAT CAC TGG TTC TCC AGA GAG AAC TCC CTG AGT GGA GTG 9090 
Leu Asn Glu Asp His Trp Phe Ser Arg Glu Asn Ser Leu Ser Gly Val 
2985 2990 2995 

GAA GGA GAA GGG CTG CAC AAG CTA GGT TAC ATT CTA AGA GAC GTG AGC 9138 
Glu Gly Glu Gly Leu His Lys Leu Gly Tyr Ile Leu Arg Asp Val Ser
3000 3005 3010 

AAG AAA GAG GGA GGA GCA ATG TAT GCC GAT GAC ACC GCA GGA TGG GAT 9186 

Lys Lys Glu Gly Gly Ala Met Tyr Ala Asp Asp Thr Ala Gly Trp Asp 
3015 3020 3025 3030 

ACA AGA ATC ACA CTA GAA GAC KKA AAA AAT GAA GAA ATG GTA ACA AAC 9234 
Thr Arg Ile Thr Leu Glu Asp Xaa Lys Asn Glu Glu Met Val Thr Asn

3035 3040 3045 

CAC ATG GAA GGA GAA CAC AAG AAA CTA GCC GAG GCC ATT TTC AAA CTA 9282 
His Met Glu Gly Glu His Lys Lys Leu Ala Glu Ala Ile Phe Lys Leu

3050 3055 3060 

ACG TAC CAA AAC AAG GTG GTG CGT GTG CAA AGA CCA ACA CCA AGA GGC 9330 
Thr Tyr Gln Asn Lys Val Val Arg Val Gln Arg Pro Thr Pro Arg Gly
3065 3070 3075

ACA GTA ATG GAC ATC ATA TCG AGA AGA GAC CAA AGA GGT AGT GGA CAA 9378
Thr Val Met Asp Ile Ile Ser Arg Arg Asp Gln Arg Gly Ser Gly Gln
3080 3085 3090 

GTT GGC ACC TAT GGA CTC AAT ACT TTC ACC AAT ATG GAA GCC CAA CTA 9426 

Val Gly Thr Tyr Gly Leu Asn Thr Phe Thr Asn Met Glu Ala Gln Leu
3095 3100 3105 3110 

ATC AGA CAG ATG GAG GGA GAA GGA GTC TTT AAA AGC ATT CAG CAC CTA 9474 
Ile Arg Gln Met Glu Gly Glu Gly Val Phe Lys Ser Ile Gln His Leu

3115 3120 3125 

ACA ATC ACA GAA GAA ATC GCT GTG CAA AAC TGG TTA GCA AGA GTG GGG 9522 
Thr Ile Thr Glu Glu Ile Ala Val Gln Asn Trp Leu Ala Arg Val Gly

3130 3135 3140 

CGC GAA AGG TTA TCA AGA ATG GCC ATC AGT GGA GAT GAT TGT GTT GTG 9570 
Arg Glu Arg Leu Ser Arg Met Ala Ile Ser Gly Asp Asp Cys Val Val
3145 3150 3155 

AAA CCT TTA GAT GAC AGG TTC GCA AGC GCT TTA ACA GCT CTA AAT GAC 9618 
Lys Pro Leu Asp Asp Arg Phe Ala Ser Ala Leu Thr Ala Leu Asn Asp 
3160 3165 3170 

ATG GGA AAG ATT AGG AAA GAC ATA CAA CAA TGG GAA CCT TCA AGA GGA 9666 
Met Gly Lys Ile Arg Lys Asp Ile Gln Gln Trp Glu Pro Ser Arg Gly
3175 3180 3185 3190 

TGG AAT GAT TGG ACA CAA GTG CCC TTC TGT TCA CAC CAT TTC CAT GAG 9714 
Trp Asn Asp Trp Thr Gln Val Pro Phe Cys Ser His His Phe His Glu

3195 3200 3205 

TTA ATC ATG AAA GAC GGT CGC GTA CTC GTT GTT CCA TGT AGA AAC CAA 9762 

Leu Ile Met Lys Asp Gly Arg Val Leu Val Val Pro Cys Arg Asn Gln

3210 3215 3220 

GAT GAA CTG ATT GGC AGA GCC CGA ATC TCC CAA GGA GCA GGG TGG TCT 9810 
Asp Glu Leu Ile Gly Arg Ala Arg Ile Ser Gln Gly Ala Gly Trp Ser
3225 3230 3235 

TTG CGG GAG ACG GCC TGT TTG GGG AAG TCT TAC GCC CAA ATG TGG AGC 9858 
Leu Arg Glu Thr Ala Cys Leu Gly Lys Ser Tyr Ala Gln Met Trp Ser
3240 3245 3250 

TTG ATG TAC TTC CAC AGA CGC GAC CTC AGG CTG GCG GCA AAT GCT ATT 9906 
Leu Met Tyr Phe His Arg Arg Asp Leu Arg Leu Ala Ala Asn Ala Ile
3255 3260 3265 3270 

TGC TCG GCA GTA CCA TCA CAT TGG GTT CCA ACA AGT CGA ACA ACC TGG 9954 

Cys Ser Ala Val Pro Ser His Trp Val Pro Thr Ser Arg Thr Thr Trp 

3275 3280 3285 

TCC ATA CAT GCT AAA CAT GAA TGG ATG ACA ACG GAA GAC ATG CTG ACA 10002 

Ser Ile His Ala Lys His Glu Trp Met Thr Thr Glu Asp Met Leu Thr

3290 3295 3300 

GTC TGG AAC AGG GTG TGG ATT CAA GAA AAC CCA TGG ATG GAA GAC AAA 10050 
Val Trp Asn Arg Val Trp Ile Gln Glu Asn Pro Trp Met Glu Asp Lys
3305 3310 3315

ACT CCA GTG GAA TCA TGG GAG GAA ATC CCA TAC TTG GGG AAA AGA GAA 10098
Thr Pro Val Glu Ser Trp Glu Glu Ile Pro Tyr Leu Gly Lys Arg Glu
3320 3325 3330 

GAC CAA TGG TGC GGC TCA TTG ATT GGG TTA ACA AGC AGG GCC ACC TGG 10146 

Asp Gln Trp Cys Gly Ser Leu Ile Gly Leu Thr Ser Arg Ala Thr Trp
3335 3340 3345 3350 

GCA AAG AAC ATC CAA GCA GCA ATA AAT CAA GTT AGA TCC CTT ATA GGC 10194 
Ala Lys Asn Ile Gln Ala Ala Ile Asn Gln Val Arg Ser Leu Ile Gly

3355 3360 3365 

AAT GAA GAA TAC ACA GAT TAC ATG CCA TCC ATG AAA AGA TTC AGA AGA 10242 
Asn Glu Glu Tyr Thr Asp Tyr Met Pro Ser Met Lys Arg Phe Arg Arg 

3370 3375 3380 

GAA GAG GAA GAA GCA GGA GTT CTG TGG TAGAAAGCAA AACTAACATG AAACAAGG 10297 

Glu Glu Glu Glu Ala Gly Val Leu Trp 

3385 3390 

CTAGAAGTCA GGTCGGATTA AGCCATAGTA CGGAAAAAAC TATGCTACCT GTGAGCCCCG 10357 

TCCAAGGACG TTAAAAGAAG TCAGGCCATC ATAAATGCCA TAGCTTGAGT AAACTATGCA 10417 

GCCTGTAGCT CCACCTGAGA AGGTGTAAAA AATCCGGGAG GCCACAAACC ATGGAAGCTG 10477 

TACGCATGGC GTAGTGGACT AGCGGTTAGA GAGGACCCCT CCCTTACAAA TCGCAGCAAC 10537 

AATGGGGGCC CAAGGCGAGA TGAAGCTGTA GTCTCGCTGG AAGGACTAGA GGTTAGAGGA 10597 

GACCCCCCCG AAACAAAAAA CAGCATATTG ACGCTGGGAA AGACCAGAGA TCCTGCTGTC 10657 

TCCTCAGCAT CATTCCAGGC ACAGAACGCC AGAAAATGGA ATGGTGCTGT TGAATCAACA 10717 

GGTTCT 10723 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
CCCAGTCACG ACGTTGTAAA ACGAC 25 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
GGATGTGCTG CAAGGCGATT AAGTTGG 27
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
TGAGCGGATA ACAATTTCAC ACAGG 25 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
GGCTTTACAC TTTATGCTTC CGGCTCG 27 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 75 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GCGGATATTG GAATTCTCTA GAAATTTAAT ACGACTCACT ATAAGTTGTT AGTCTACGTG 60 
GACCGACAAA GACAG 75 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 77 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

CCAGTGAATT CGAGCTCACG CGTAAATTTA ATACGACTCA CTATAAGTTG TTAGTCTACG 60 
TGGACCGACA AAGACAG 77 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
AGTTGTTAGT CTACGTGGAC CGAC 24 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
GACAGATTCT TTGAGGGAGC TGAGCTCAAC GTAG 34 
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
TCAATATGCT GAAACGCGAG AGAAACCG 28

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GGGATTGTTA GGAAACGAAG GAACGC 26

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CCACCAACAG CAGGGATACT GAAAAGATGG GG 32

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
TGCAGATCTG CGTCTCCTAT TCAAG 25

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
CGTGAACATG TGTACCCTCA TGGCC 25

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
TTGCACCAAC AGTCAATGTC TTCAGG 26

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
ACCAGAAGAC ATAGATTGTT GGTGC 25

(2) INFORMATION FOR SEQ ID NO: 18: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18 
GCACCAACAG TCTATGTCTT CTGGC 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19 
ATGTTTCCAG GCCCCTTCTG ATGAC 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20 
GCAGCAATCC TGGCATACAC CATAG 

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 
(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21 
GGTTGACATA GTCTTAGAAC ATGGAAG 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22 
CTTCCATGTT CTAAGACTAT GTCAACC 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23 
GTCTTAGAAC ATGGAAGTTG TGTGACGACG ATGGC 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24 
ACAACAGAAT CTCGCTGCCC AACAC 
(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25 
GCAAACACTC CATGGTAGAC AGAGG 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26 
CCTCTGTCTA CCATGGAGTG TTTGC 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27 
CCACATCCAT TTCCCCATCC TCTGTCT 

(2) INFORMATION FOR SEQ ID NO: 28: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
GGAAAGGGAG GCATTGTGAC CTGTGCTATG 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
GGAAATCAAA ATAACACCAC AGAGTTCC 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
CTGCAGCAAC ACCATCTCAT TGAAGTCGAG GCCC 

(2) INFORMATION FOR SEQ ID NO:31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
GACTTCAATG AGATGGTGCT GCTGC 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
GCAGCAGCAC CATCTCATTG AAGTC 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
AAGCTTGGCT GGTGCACAGG CAATGGTT 

(2) INFORMATION FOR SEQ ID NO:34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
TGGTAACGGC AGGTCTAGGA ACCATTG 

(2) INFORMATION FOR SEQ ID NO: 35: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
GGACATCTCA AGTGCAGGCT GAG 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
CTCAGCCTGC ACTTGAGATG TCC 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
GAAGGAAATA GCAGAAACAC AACATGG 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38
CCCTTCATAT TGTACTCTGA TAACTATTGT TCC 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39 
CCTCCATTCG GAGACAGCTA CATCATCATA GG 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40 
CCTATGATGA TGTAGCTGTC TCCGAATGGA GG 

(2) INFORMATION FOR SEQ ID NO:41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41
ATGGCCATTT TAGGTGACAC AGCCTGGGA 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42
TGTAAACACT CCTCCCAGGG ATCCAAA 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43 
CTCATAGGAG TCATTATCAC ATGGATAGG 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44 
GGGGATTCTG GTTGGAACTT ATATTGTTCT GTCC 

(2) INFORMATION FOR SEQ ID NO: 45: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45 
TGATTCAATT CTGGTGTTAT TTGTTTCCAC 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46 
AAGGAATCAT GCAGGCAGGA AAACG 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
ACTTCCAGCG AGTTCCAAGC TC 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48 
AACAGAGCCG TCCATGCCGA TATGG 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49 
TCCATTGCTC CAAAGGGTGT GT 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50
AGCTTGAGAT GGACTTTGAT TTCTG 

(2) INFORMATION FOR SEQ ID NO:51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51
GGTCTGATTT CCATCCCGTA CC 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52
GTCCTTTAGA GACCTGGGAA GAG 

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53 
GTTTTCTCAA GAGTAGTCCA GCTGC 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54 
ATCAATTGGC AGTGACTATC ATGGC 

(2) INFORMATION FOR SEQ ID NO: 55: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 
TGTTAAGAGC AGTGGAGAAA CGGAC 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
GATTGAGACC TTTGATCGTC AACGC 

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 
TGACAGGACC ATTAGTGGCT GGAGG 

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58 
CGTGCTCACT GGACGATCGG CCGATTTGGA ACTG 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59 
GGGCTGCTTC CTGATATTTC TGCC 

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60 
CCTGTGGGAA GTGAAGAAAC AACGG 

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61 
GCTCCATCTT CCAGTTCAGC CTTTCCCATG 

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62
CTCCGGCTCC AATCTGAGAG TATCC 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63 
CCTAATATCA TATGGAGGAG GCTGG 

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64 
GAAGGAGAAG AAGTCCAGGT ATTGG 

(2) INFORMATION FOR SEQ ID NO: 65: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65 
CTGTCGACAA TTGGAGATCC TGACG 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66 
GTGGAGCATA TGTGAGTGCT ATAGC 

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67 
TCTGACTATG GCCGGAAGGT ATCTC 

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68 
ACATTAATCT TGGCCCCCAC TAGAG 

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69 
CGATCTCCCG CCCGGTGTG 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70 
CTAACTGGTG ATAGCAGCCT CATGG 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71
CCTACTGAGT TGTATCACTT TCTTTCC 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72
TGGATTTCTT CCTATTCTCC CTCTTC 

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73 
TTCAAGGCTG AGAGGGTTAT AGACC 

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74
TCTGGTTGGC CTACAGAGTG GCAGC 

(2) INFORMATION FOR SEQ ID NO: 75: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75 
CCTTCTTTTG TCCAGATTTC CACTTCC 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76 
GCGTACAACC ATGCTCTCAG TGAACTGCCG GAGAC 

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77 
TTCCCAGGGT CATCTTCCCT ATAC 

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78 
GATGCTAGCC GTGATTATGC AGCACATTCC C 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79 
AAACAGAGAA CACCCCAAGA CAACC 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80 
CGGCATACAG CGTCCATGCT G 

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81
GTCTCGGGAA AGGATGGCCA TTGTC 

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82 
CTCTGGTTGC TTTTGCTTGA AGTCC 

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83 
CCGCCGCTGC TCTTTTCTGA GCTTCTC 

(2) INFORMATION FOR SEQ ID NO:84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84 
AGGACTACAT GGGCTCTGTG TGAGG 

(2) INFORMATION FOR SEQ ID NO: 85: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:85 
GAGAAGTCCA GCTCCGGCC 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86 
AGAGAAACAT GGTCACACCA GAAGG 

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87 
GTTCTTCGTG TCCTGGTCCT CC 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88 
GGAAATATGG AGGAGCCTAG TGAGG 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89 
ACCCAGTACA TCTCATGTGT GG 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90 
GAGCATGAAA CATCATGGCA CTATGACC 

(2) INFORMATION FOR SEQ ID NO:91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91
TCATGGCACT ATGACCAAGA CCACC 

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92 
CAGTCTGACC ACTCCGTTCA CC 

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93 
AAGGTGAGAA GCAATGCAGC CTTGG 

(2) INFORMATION FOR SEQ ID NO:94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94 
GGGCCATATT CACTGATGAG AACAAGTGG 

(2) INFORMATION FOR SEQ ID NO: 95: 
(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95 
TCTTTCCCTG TCAACCAGCT CC 

(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96 
AATGAAGATC ACTGGTTCTC CAGAG 

(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97 
ACGTGAGCAA GAAAGAGGGA GGAGC 

(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 
(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 
TGTCCCATCC TGCTGTGTCA TC 

(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 
GCTAGTTTCT TGTGTTCTCC TTCCATGTGG 

(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO
(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 
TCATATCGAG AAGAGACCAA AGAGG 

(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
ACTCCTTCTC CCTCCATCTG TCTG 
(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102 
ATGCTTTTGA AGATTCCTTC TCCCTCC 

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103 
GCACAGCGAT TTCTTCTGTG ATTGTTAGGT GC 

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104 
ACAATGGGAA CCTTCAAGAG GATGG 

(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 
TTATCACATT GGATCCTTCA AGAGGATGGA ATGATTGGAC ACAAG 
(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 
CAGAAGGGCA CTTGTGTCCA ATCATTCC 

(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 
CTCCCTGGGA AATTCGGGCT C 

(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:
CCGTCTCCCG CAAAGACCAC CCTGCTCC 

(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 
TTATCACCTA TCTAGACCGT CTCCCGCAAA GACCACCCTG CTCC 
(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 
GTTGGAACCC AATGTGATGG TACTGC 

(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 
ACAAGTCGAA CAACCTGGTC CATAC 

(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112 
GCATGTCTTC CGTCGTCATC C 

(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113 
CTTGAATCCA CACCCTGTTC CAGAC 

(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114 
ATACACAGAT TACATGCCAT CCATG 

(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115 
TTTTGCCTTC TACCACAGGA C 

(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116 
GAAACAAGGC TAGAAGTCAG GTCGG 

(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117 
GACGGGGCTC ACAGGTAGCA TAG 

(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:
GCCTGTAGCT CCACCTGAGA AGGTG 

(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119 
GGAAGCTGTA CGCATGGCGT AGTGG 

(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120 
GGGCCCCCGT TGTTGCTGC 

(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:
AGAACCTGTT GATTCAACAG CACCATTCCA TTTTCTG 

(2) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 59 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 
TTATCACCTA GCATGCTCTA GAAGAACCTG TTGATTCAAC AGCACCATTC CATTTTCTG 59 
(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 
TTATCACCTA TCTAGAGAAC CTGTTGATTC AACAGCACCA TTCCATTTTC TG 52 
(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2394 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 1...2394 
(D) OTHER INFORMATION: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

AGA TTC TCA AAA GGA TTG CTC TCA GGC CAA GGA CCC ATG AAA TTG GTG 48
Arg Phe Ser Lys Gly Leu Leu Ser Gly Gln Gly Pro Met Lys Leu Val
1 5 10 15

ATG GCT TTC ATA GCA TTC TTA AGA TTT CTA GCC ATA CCC CCA ACA GCA 96
Met Ala Phe Ile Ala Phe Leu Arg Phe Leu Ala Ile Pro Pro Thr Ala
20 25 30

GGA ATT TTG GCT AGA TGG GGC TCA TTC AAG AAG AAT GGA GCG ATT AAA 144
Gly Ile Leu Ala Arg Trp Gly Ser Phe Lys Lys Asn Gly Ala Ile Lys
35 40 45

GTG TTA CGG GGT TTC AAG AGA GAA ATC TCA AAC ATG CTA AAC ATA ATG 192
Val Leu Arg Gly Phe Lys Arg Glu Ile Ser Asn Met Leu Asn Ile Met
50 55 60

AAC AGG AGG AAA AGA TCC GTG ACC ATG CTC CTT ATG CTG CTG CCC ACA 240
Asn Arg Arg Lys Arg Ser Val Thr Met Leu Leu Met Leu Leu Pro Thr
65 70 75 80

GCC CTG GCG TTC CAT CTG ACG ACA CGA GGG GGA GAG CCG CAT ATG ATA 288
Ala Leu Ala Phe His Leu Thr Thr Arg Gly Gly Glu Pro His Met Ile
85 90 95

GTT AGC AAG CAG GAA AGA GGA AAG TCA CTT TTG TTC AAG ACC TCT GCA 336
Val Ser Lys Gln Glu Arg Gly Lys Ser Leu Leu Phe Lys Thr Ser Ala
100 105 110

GGT GTC AAC ATG TGC ACC CTC ATT GCG ATG GAT TTG GGA GAG TTG TGT 384
Gly Val Asn Met Cys Thr Leu Ile Ala Met Asp Leu Gly Glu Leu Cys
115 120 125

GAG GAC ACG ATG ACC TAC AAA TGC CCC CGG ATC ACT GAG GCG GAA CCA 432
Glu Asp Thr Met Thr Tyr Lys Cys Pro Arg Ile Thr Glu Ala Glu Pro
130 135 140

GAT GAC GTT GAC TGT TGG TGC AAT GCC ACG GAC ACA TGG GTG ACC TAT 480
Asp Asp Val Asp Cys Trp Cys Asn Ala Thr Asp Thr Trp Val Thr Tyr
145 150 155 160

GGA ACG TGC TCT CAA ACT GGC GAA CAC CGA CGA GAC AAA CGT TCC GTC 528
Gly Thr Cys Ser Gln Thr Gly Glu His Arg Arg Asp Lys Arg Ser Val
165 170 175

GCA TTG GCC CCA CAC GTG GGG CTT GGC CTA GAA ACA AGA GCC GAA ACG 576
Ala Leu Ala Pro His Val Gly Leu Gly Leu Glu Thr Arg Ala Glu Thr
180 185 190

TGG ATG TCC TCT GAA GGT GCT TGG AAA CAG ATA CAA AAA GTA GAG ACT 624
Trp Met Ser Ser Glu Gly Ala Trp Lys Gln Ile Gln Lys Val Glu Thr
195 200 205

TGG GCT CTG AGA CAT CCA GGA TTC ACG GTG ATA GCC CTT TTT CTA GCA 672
Trp Ala Leu Arg His Pro Gly Phe Thr Val Ile Ala Leu Phe Leu Ala
210 215 220

CAT GCC ATA GGA ACA TCC ATC ACC CAG AAA GGG ATC ATT TTC ATT TTG 720
His Ala Ile Gly Thr Ser Ile Thr Gln Lys Gly Ile Ile Phe Ile Leu
225 230 235 240

CTG ATG CTG GTA ACA CCA TCT ATG GCC ATG CGA TGC GTG GGA ATA GGC 768
Leu Met Leu Val Thr Pro Ser Met Ala Met Arg Cys Val Gly Ile Gly
245 250 255

AAC AGA GAC TTC GTG GAA GGA CTG TCA GGA GCA ACA TGG GTG GAT GTG 816
Asn Arg Asp Phe Val Glu Gly Leu Ser Gly Ala Thr Trp Val Asp Val
260 265 270

GTA CTG GAG CAT GGA AGT TGC GTC ACC ACC ATG GCA AAA AAC AAA CCA 864
Val Leu Glu His Gly Ser Cys Val Thr Thr Met Ala Lys Asn Lys Pro
275 280 285

ACA CTG GAC ATT GAA CTC TTG AAG ACG GAG GTC ACA AAC CCT GCA GTT 912
Thr Leu Asp Ile Glu Leu Leu Lys Thr Glu Val Thr Asn Pro Ala Val
290 295 300

CTG CGT AAA TTG TGC ATT GAA GCT AAA ATA TCA AAC ACC ACC ACC GAT 960
Leu Arg Lys Leu Cys Ile Glu Ala Lys Ile Ser Asn Thr Thr Thr Asp
305 310 315 320

TCG AGA TGT CCA ACA CAA GGA GAA GCC ACA CTG GTG GAA GAA CAA GAC 1008
Ser Arg Cys Pro Thr Gln Gly Glu Ala Thr Leu Val Glu Glu Gln Asp
325 330 335

GCG AAC TTT GTG TGC CGA CGA ACG TTC GTG GAC AGA GGC TGG GGC AAT 1056
Ala Asn Phe Val Cys Arg Arg Thr Phe Val Asp Arg Gly Trp Gly Asn
340 345 350

GGC TGT GGG CTA TTC GGA AAA GGT AGT CTA ATA ACG TGT GCC AAG TTT 1104
Gly Cys Gly Leu Phe Gly Lys Gly Ser Leu Ile Thr Cys Ala Lys Phe
355 360 365

AAG TGT GTG ACA AAA CTA GAA GGA AAG ATA GCT CAA TAT GAA AAC CTA 1152
Lys Cys Val Thr Lys Leu Glu Gly Lys Ile Ala Gln Tyr Glu Asn Leu
370 375 380

AAA TAT TCA GTG ATA GTC ACC GTC CAC ACT GGA GAT CAG CAC CAG GTG 1200
Lys Tyr Ser Val Ile Val Thr Val His Thr Gly Asp Gln His Gln Val
385 390 395 400

GGA AAT GAG ACT ACA GAA CAT GGA ACA ACT GCA ACC ATA ACA CCT CAA 1248
Gly Asn Glu Thr Thr Glu His Gly Thr Thr Ala Thr Ile Thr Pro Gln
405 410 415

GCT CCT ACG TCG GAA ATA CAG CTG ACC GAC TAC GGA ACC CTT ACA TTA 1296
Ala Pro Thr Ser Glu Ile Gln Leu Thr Asp Tyr Gly Thr Leu Thr Leu
420 425 430

GAT TGT TCA CCT AGG ACA GGG CTA GAT TTT AAC GAG ATG GTG TTG CTG 1344
Asp Cys Ser Pro Arg Thr Gly Leu Asp Phe Asn Glu Met Val Leu Leu
435 440 445

ACA ATG AAA AAG AAA TCA TGG CTT GTC CAC AAA CAG TGG TTT CTA GAC 1392
Thr Met Lys Lys Lys Ser Trp Leu Val His Lys Gln Trp Phe Leu Asp
450 455 460

TTA CCA CTG CCT TGG ACC TCT GGG GCT TTA ACA TCC CAA GAG ACT TGG 1440
Leu Pro Leu Pro Trp Thr Ser Gly Ala Leu Thr Ser Gln Glu Thr Trp
465 470 475 480

AAC AGA CAA GAT TTA CTG GTC ACA TTT AAG ACA GCT CAT GCA AAG AAG 1488
Asn Arg Gln Asp Leu Leu Val Thr Phe Lys Thr Ala His Ala Lys Lys
485 490 495

CAG GAA GTA GTC GTA CTA GGA TCA CAA GAA GGA GCA ATG CAC ACT GCG 1536
Gln Glu Val Val Val Leu Gly Ser Gln Glu Gly Ala Met His Thr Ala
500 505 510

CTG ACT GGA GCG ACA GAA ATC CAA ACG TCA GGA ACG ACA ACA ATT TTC 1584
Leu Thr Gly Ala Thr Glu Ile Gln Thr Ser Gly Thr Thr Thr Ile Phe
515 520 525

GCA GGA CAC CTA AAA TGC AGA CTA AAA ATG GAC AAA CTA ACT TTA AAA 1632
Ala Gly His Leu Lys Cys Arg Leu Lys Met Asp Lys Leu Thr Leu Lys
530 535 540

GGG ATG TCA TAT GTG ATG TGC ACA GGC TCA TTC AAG TTA GAG AAA GAA 1680
Gly Met Ser Tyr Val Met Cys Thr Gly Ser Phe Lys Leu Glu Lys Glu
545 550 555 560

GTG GCT GAG ACC CAG CAT GGA ACT GTT CTG GTG CAG GTT AAA TAT GAA 1728
Val Ala Glu Thr Gln His Gly Thr Val Leu Val Gln Val Lys Tyr Glu
565 570 575

GGA ACA GAC GCA CCA TGC AAG ATT CCC TTT TCG ACC CAA GAT GAG AAA 1776
Gly Thr Asp Ala Pro Cys Lys Ile Pro Phe Ser Thr Gln Asp Glu Lys
580 585 590

GGA GCA ACC CAG AAT GGG AGA TTA ATA ACA GCC AAC CCC ATA GTC ACT 1824
Gly Ala Thr Gln Asn Gly Arg Leu Ile Thr Ala Asn Pro Ile Val Thr
595 600 605

GAC AAA GAA AAA CCA GTC AAT ATT GAG GCA GAA CCA CCC TTT GGT GAG 1872
Asp Lys Glu Lys Pro Val Asn Ile Glu Ala Glu Pro Pro Phe Gly Glu
610 615 620

AGC TAC ATC GTG GTA GGA GCA GGT GAA AAA GCT TTG AAA CTA AGC TGG 1920
Ser Tyr Ile Val Val Gly Ala Gly Glu Lys Ala Leu Lys Leu Ser Trp
625 630 635 640

TTC AAG AAA GGA AGC AGC ATA GGG AAA ATG TTT GAA GCA ACT GCC CGA 1968
Phe Lys Lys Gly Ser Ser Ile Gly Lys Met Phe Glu Ala Thr Ala Arg
645 650 655

GGA GCA CGA AGG ATG GCC ATT CTG GGA GAC ACC GCA TGG GAC TTC GGT 2016
Gly Ala Arg Arg Met Ala Ile Leu Gly Asp Thr Ala Trp Asp Phe Gly
660 665 670

TCT ATA GGA GGA GTG TTC ACG TCT ATG GGA AAA CTG GTA CAC CAG GTT 2064
Ser Ile Gly Gly Val Phe Thr Ser Met Gly Lys Leu Val His Gln Val
675 680 685

TTT GGA ACT GCA TAT GGA GTT TTG TTT AGC GGA GTT TCT TGG ACC ATG 2112
Phe Gly Thr Ala Tyr Gly Val Leu Phe Ser Gly Val Ser Trp Thr Met
690 695 700

AAA ATA GGA ATA GGG ATT CTG CTG ACA TGG CTA GGA TTA AAT TCA AGG 2160
Lys Ile Gly Ile Gly Ile Leu Leu Thr Trp Leu Gly Leu Asn Ser Arg
705 710 715 720

AAC ACG TCC CTT TCG GTG ATG TGC ATC GCA GTT GGC ATG GTC ACA CTG 2208
Asn Thr Ser Leu Ser Val Met Cys Ile Ala Val Gly Met Val Thr Leu
725 730 735

TAC CTA GGA GTC ATG GTT CAG GCA GAT TCG GGA TGT GTA ATC AAC TGG 2256
Tyr Leu Gly Val Met Val Gln Ala Asp Ser Gly Cys Val Ile Asn Trp
740 745 750

AAA GGC AGA GAA CTT AAA TGT GGA AGC GGC ATT TTT GTC ACT AAT GAA 2304
Lys Gly Arg Glu Leu Lys Cys Gly Ser Gly Ile Phe Val Thr Asn Glu
755 760 765

GTT CAC ACT TGG ACA GAG CAA TAC AAA TTC CAG GCT GAC TCC CCC AAG 2352
Val His Thr Trp Thr Glu Gln Tyr Lys Phe Gln Ala Asp Ser Pro Lys
770 775 780

AGA CTA TCA GCA GCC ATT GGG AAG GCA TGG GAG GAG GGT GTG 2394
Arg Leu Ser Ala Ala Ile Gly Lys Ala Trp Glu Glu Gly Val
785 790 795
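
The residue line printed under each codon row of a coding sequence can be recomputed from the nucleotide triplets, which gives a quick consistency check on a transcribed listing. The following is a minimal sketch, assuming the standard genetic code; the names `CODON_TABLE`, `codons_in`, `translate` and `mismatches` are illustrative helpers and are not part of the listing.

```python
# Sketch only: recompute the three-letter translation printed under a codon row.
# Assumes the standard genetic code; all names here are hypothetical helpers.
BASES = "TCAG"
AMINO = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr *** *** Cys Cys *** Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()

# Build the 64-entry table in T, C, A, G order for each codon position.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons_in(row):
    """Return the codons of a listing row, ignoring the trailing base count."""
    return [tok for tok in row.split() if len(tok) == 3 and tok.isalpha()]


def translate(row):
    """Translate a codon row into space-separated three-letter residues."""
    return " ".join(CODON_TABLE[codon] for codon in codons_in(row))


def mismatches(codon_row, residue_row):
    """List (position, codon, expected, printed) where the printed residue differs."""
    printed = residue_row.split()
    return [
        (pos, codon, CODON_TABLE[codon], shown)
        for pos, (codon, shown) in enumerate(zip(codons_in(codon_row), printed), start=1)
        if CODON_TABLE[codon] != shown
    ]


# First codon row of SEQ ID NO: 124, including its cumulative base count.
first_row = "AGA TTC TCA AAA GGA TTG CTC TCA GGC CAA GGA CCC ATG AAA TTG GTG 48"
print(translate(first_row))
```

Passing a codon row and the residue row printed beneath it to `mismatches` returns the positions at which the two disagree, which is where a misread character in either line is most likely.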



(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2145 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 1...2145 
(D) OTHER INFORMATION: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 
AAG GTC TTA AAA GGC TTC AAG AAG GAG ATC TCA AAC ATG CTG AGC ATT 48
Lys Val Leu Lys Gly Phe Lys Lys Glu Ile Ser Asn Met Leu Ser Ile
1 5 10 15

ATC AAC AAA CGG AAA AAG ACA TCG CTC TGT CTC ATG ATG ATG TTA CCA 96
Ile Asn Lys Arg Lys Lys Thr Ser Leu Cys Leu Met Met Met Leu Pro
20 25 30

GCA ACA CTT GCT TTC CAC TTA ACT TCA CGA GAT GGA GAG CCG CGC ATG 144
Ala Thr Leu Ala Phe His Leu Thr Ser Arg Asp Gly Glu Pro Arg Met
35 40 45

ATT GTG GGG AAG AAT GAA AGA GGA AAA TCC CTA CTT TTC AAG ACA GCC 192
Ile Val Gly Lys Asn Glu Arg Gly Lys Ser Leu Leu Phe Lys Thr Ala
50 55 60

TCT GGA ATC AAC ATG TGC ACA CTC ATA GCT ATG GAT CTG GGA GAG ATG 240
Ser Gly Ile Asn Met Cys Thr Leu Ile Ala Met Asp Leu Gly Glu Met
65 70 75 80

TGT GAT GAC ACG GTC ACT TAC AAA TGC CCC CAC ATT ACC GAA GTG GAG 288
Cys Asp Asp Thr Val Thr Tyr Lys Cys Pro His Ile Thr Glu Val Glu
85 90 95

CCT GAA GAC ATT GAC TGC TGG TGC AAC CTT ACA TCG ACA TGG GTG ACT 336
Pro Glu Asp Ile Asp Cys Trp Cys Asn Leu Thr Ser Thr Trp Val Thr
100 105 110

TAT GGA ACA TGC AAT CAA GCT GGA GAG CAT AGA CGC GAT AAG AGA TCA 384
Tyr Gly Thr Cys Asn Gln Ala Gly Glu His Arg Arg Asp Lys Arg Ser
115 120 125

GTG GCG TTA GCT CCC CAT GTT GGC ATG GGA CTG GAC ACA CGC ACT CAA 432
Val Ala Leu Ala Pro His Val Gly Met Gly Leu Asp Thr Arg Thr Gln
130 135 140

ACC TGG ATG TCG GCT GAA GGA GCT TGG AGA CAA GTC GAG AAG GTA GAG 480
Thr Trp Met Ser Ala Glu Gly Ala Trp Arg Gln Val Glu Lys Val Glu
145 150 155 160

ACA TGG GCC CTT AGG CAC CCA GGG TTT ACC ATA CTA GCC CTA TTT CTT 528
Thr Trp Ala Leu Arg His Pro Gly Phe Thr Ile Leu Ala Leu Phe Leu
165 170 175

GCC CAT TAC ATA GGC ACT TCC TTG ACC CAG AAA GTG GTT ATT TTT ATA 576
Ala His Tyr Ile Gly Thr Ser Leu Thr Gln Lys Val Val Ile Phe Ile
180 185 190

CTA TTA ATG CTG GTT ACC CCA TCC ATG ACA ATG AGA TGT GTA GGA GTA 624
Leu Leu Met Leu Val Thr Pro Ser Met Thr Met Arg Cys Val Gly Val
195 200 205

GGA AAC AGA GAT TTT GTG GAA GGC CTA TCG GGA GCT ACG TGG GTT GAC 672
Gly Asn Arg Asp Phe Val Glu Gly Leu Ser Gly Ala Thr Trp Val Asp
210 215 220

GTG GTG CTC GAG CAC GGT GGG TGT GTG ACT ACC ATG GCT AAG AAC AAG 720
Val Val Leu Glu His Gly Gly Cys Val Thr Thr Met Ala Lys Asn Lys
225 230 235 240

CCC ACG CTG GAC ATA GAG CTT CAG AAG ACC GAG GCC ACC CAA CTG GCG 768
Pro Thr Leu Asp Ile Glu Leu Gln Lys Thr Glu Ala Thr Gln Leu Ala
245 250 255

ACC CTA AGG AAG CTA TGC ATT GAG GGA AAA ATT ACC AAC ATA ACA ACC 816
Thr Leu Arg Lys Leu Cys Ile Glu Gly Lys Ile Thr Asn Ile Thr Thr
260 265 270

GAC TCA AGA TGT CCC ACC CAA GGG GAA GCG ATT TTA CCT GAG GAG CAG 864
Asp Ser Arg Cys Pro Thr Gln Gly Glu Ala Ile Leu Pro Glu Glu Gln
275 280 285

GAC CAG AAC TAC GTG TGT AAG CAT ACA TAC GTG GAC AGA GGC TGG GGA 912
Asp Gln Asn Tyr Val Cys Lys His Thr Tyr Val Asp Arg Gly Trp Gly
290 295 300

AAC GGT TGT GGT TTG TTT GGC AAG GGA AGC TTG GTG ACA TGC GCG AAA 960
Asn Gly Cys Gly Leu Phe Gly Lys Gly Ser Leu Val Thr Cys Ala Lys
305 310 315 320

TTT CAA TGT TTA GAA TCA ATA GAG GGA AAA GTG GTG CAA CAT GAG AAC 1008
Phe Gln Cys Leu Glu Ser Ile Glu Gly Lys Val Val Gln His Glu Asn
325 330 335

CTC AAA TAC ACC GTC ATC ATC ACA GTG CAC ACA GGA GAC CAA CAC CAG 1056
Leu Lys Tyr Thr Val Ile Ile Thr Val His Thr Gly Asp Gln His Gln
340 345 350

GTG GGA AAT GAA ACG CAG GGA GTC ACG GCT GAG ATA ACA CCC CAG GCA 1104
Val Gly Asn Glu Thr Gln Gly Val Thr Ala Glu Ile Thr Pro Gln Ala
355 360 365

TCA ACC GCT GAA GCC ATT TTA CCT GAA TAT GGA ACC CTC GGG CTA GAA 1152
Ser Thr Ala Glu Ala Ile Leu Pro Glu Tyr Gly Thr Leu Gly Leu Glu
370 375 380

TGC TCA CCA CGG ACA GGT TTG GAT TTC AAT GAA ATG ATC TCA TTG ACA 1200
Cys Ser Pro Arg Thr Gly Leu Asp Phe Asn Glu Met Ile Ser Leu Thr
385 390 395 400

ATG AAG AAC AAA GCA TGG ATG GTA CAT AGA CAA TGG TTC TTT GAC TTA 1248
Met Lys Asn Lys Ala Trp Met Val His Arg Gln Trp Phe Phe Asp Leu
405 410 415

CCC CTA CCA TGG ACA TCA GGA GCT ACA GCA GAA ACA CCA ACT TGG AAC 1296
Pro Leu Pro Trp Thr Ser Gly Ala Thr Ala Glu Thr Pro Thr Trp Asn
420 425 430

AGG AAA GAG CTT CTT GTG ACA TTT AAA AAT GCA CAT GCA AAA AAG CAA 1344
Arg Lys Glu Leu Leu Val Thr Phe Lys Asn Ala His Ala Lys Lys Gln
435 440 445

GAA GTA GTT GTT CTT GGA TCA CAA GAG GGA GCA ATG CAT ACA GCA CTG 1392
Glu Val Val Val Leu Gly Ser Gln Glu Gly Ala Met His Thr Ala Leu
450 455 460

ACA GGA GCT ACA GAG ATC CAA ACC TCA GGA GGC ACA AGT ATC TTT GCG 1440
Thr Gly Ala Thr Glu Ile Gln Thr Ser Gly Gly Thr Ser Ile Phe Ala
465 470 475 480

GGG CAC TTA AAA TGT AGA CTC AAG ATG GAC AAA TTG GAA CTC AAA GGG 1488
Gly His Leu Lys Cys Arg Leu Lys Met Asp Lys Leu Glu Leu Lys Gly
485 490 495

ATG AGC TAT GCA ATG TGC TTG GGT AGC TTT GTG TTG AAG AAA GAA GTC 1536
Met Ser Tyr Ala Met Cys Leu Gly Ser Phe Val Leu Lys Lys Glu Val
500 505 510

TCC GAA ACG CAG CAT GGG ACA ATA CTC ATT AAG GTT GAG TAC AAA GGG 1584
Ser Glu Thr Gln His Gly Thr Ile Leu Ile Lys Val Glu Tyr Lys Gly
515 520 525

AAA GAT GCA CCC TGC AAG ATT CCT TTC TCC ACG GAG GAT GGA CAA GGA 1632
Lys Asp Ala Pro Cys Lys Ile Pro Phe Ser Thr Glu Asp Gly Gln Gly
530 535 540

AAA GCT CAC AAT GGC AGA CTG ATC ACA GCC AAT CCA GTG GTG ACC AAG 1680
Lys Ala His Asn Gly Arg Leu Ile Thr Ala Asn Pro Val Val Thr Lys
545 550 555 560

AAG GAG GAG CCT GTC AAC ATT GAG GCT GAA CCT CCT TTT GGA GAA AGT 1728
Lys Glu Glu Pro Val Asn Ile Glu Ala Glu Pro Pro Phe Gly Glu Ser
565 570 575

AAC ATA GTA ATT GGA ATT GGA GAC AAA GCC CTG AAA ATC AAC TGG TAC 1776
Asn Ile Val Ile Gly Ile Gly Asp Lys Ala Leu Lys Ile Asn Trp Tyr
580 585 590

AAG AAG GGA AGC TCG ATT GGG AAG ATG TTC GAG GCT ACT GCC AGA GGT 1824
Lys Lys Gly Ser Ser Ile Gly Lys Met Phe Glu Ala Thr Ala Arg Gly
595 600 605

GCA AGG CGC ATG GCC ATC TTG GGA GAC ACA GCC TGG GAC TTT GGA TCA 1872
Ala Arg Arg Met Ala Ile Leu Gly Asp Thr Ala Trp Asp Phe Gly Ser
610 615 620

GTG GGT GGT GTT TTG AAT TCA TTA GGG AAA ATG GTC CAC CAA ATA TTT 1920
Val Gly Gly Val Leu Asn Ser Leu Gly Lys Met Val His Gln Ile Phe
625 630 635 640

GGG AGT GCT TAC ACA GCC CTA TTT GGT GGA GTC TCC TGG ATG ATG AAA 1968
Gly Ser Ala Tyr Thr Ala Leu Phe Gly Gly Val Ser Trp Met Met Lys
645 650 655

ATT GGA ATA GGT GTC CTC TTA ACC TGG ATA GGG TTG AAC TCA AAA AAT 2016
Ile Gly Ile Gly Val Leu Leu Thr Trp Ile Gly Leu Asn Ser Lys Asn
660 665 670

ACT TCT ATG TCA TTT TCA TGC ATC GCG ATA GGA ATC ATT ACA CTC TAT 2064
Thr Ser Met Ser Phe Ser Cys Ile Ala Ile Gly Ile Ile Thr Leu Tyr
675 680 685

CTG GGA GCC GTG GTG CAA GCT GAC ATG GGG TGT GTC ATA AAC TGG AAA 2112
Leu Gly Ala Val Val Gln Ala Asp Met Gly Cys Val Ile Asn Trp Lys
690 695 700

GGC AAA GAA CTC AAA TGT GGA AGT GGA ATT TTC 2145
Gly Lys Glu Leu Lys Cys Gly Ser Gly Ile Phe
705 710 715



(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2175 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 1...2175 
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

ATT CTG AAG AGA TGG GGA CAG TTG AAG AAA AAT AAG GCC ATC AGG ATA 48
Ile Leu Lys Arg Trp Gly Gln Leu Lys Lys Asn Lys Ala Ile Arg Ile
1 5 10 15

CTG ATT GGA TTC AGG AAG GAG ATA GGC CGC ATG CTG AAC ATC TTG AAC 96
Leu Ile Gly Phe Arg Lys Glu Ile Gly Arg Met Leu Asn Ile Leu Asn
20 25 30

GGG AGA AAA AGG TCA ACG ATA ACA TTG CTG TGC TTG ATT CCC ACC GTA 144
Gly Arg Lys Arg Ser Thr Ile Thr Leu Leu Cys Leu Ile Pro Thr Val
35 40 45

ATG GCG TTT CAC TTG TCA ACA AGA GAT GGC GAA CCC CTC ATG ATA GTG 192
Met Ala Phe His Leu Ser Thr Arg Asp Gly Glu Pro Leu Met Ile Val
50 55 60

GCA AAA CAT GAA AGG GGG AGA CCT CTC TTG TTT AAG ACA ACA GAG GGG 240
Ala Lys His Glu Arg Gly Arg Pro Leu Leu Phe Lys Thr Thr Glu Gly
65 70 75 80

ATC AAC AAA TGC ACT CTC ATT GCC ATG GAC TTG GGT GAA ATG TGT GAG 288
Ile Asn Lys Cys Thr Leu Ile Ala Met Asp Leu Gly Glu Met Cys Glu
85 90 95

GAC ACT GTC ACG TAT AAA TGC CCC TTA CTG GTC AAT ACC GAA CCT GAA 336
Asp Thr Val Thr Tyr Lys Cys Pro Leu Leu Val Asn Thr Glu Pro Glu
100 105 110

GAC ATT GAT TGC TGG TGC AAT CTC ACG TCT ACC TGG GTC ACA TAT GGG 384
Asp Ile Asp Cys Trp Cys Asn Leu Thr Ser Thr Trp Val Thr Tyr Gly
115 120 125

ACA TAC ACC CAG AGC GGA GAA CGG AGA CGA GAG AAG CGC TCA GTA GCT 432
Thr Tyr Thr Gln Ser Gly Glu Arg Arg Arg Glu Lys Arg Ser Val Ala
130 135 140

TTA ACA CCA CAT TCA GGA ATG GGA TTG GAA ACA AGA GCT GAG ACA TGG 480
Leu Thr Pro His Ser Gly Met Gly Leu Glu Thr Arg Ala Glu Thr Trp
145 150 155 160

ATG TCA TCG GAA GGG GCT TGG AAG CAT GCT CAG AGA GTA GAG AGC TGG 528
Met Ser Ser Glu Gly Ala Trp Lys His Ala Gln Arg Val Glu Ser Trp
165 170 175

ATA CTC AGA AAC CCA GGA TTC GCG CTC TTG GCA GGA TTT ATG GCT TAT 576
Ile Leu Arg Asn Pro Gly Phe Ala Leu Leu Ala Gly Phe Met Ala Tyr
180 185 190

ATG ATT GGG CAA ACA GGA ATC CAG CGA ACT GTC TTC TTT GTC CTA ATG 624
Met Ile Gly Gln Thr Gly Ile Gln Arg Thr Val Phe Phe Val Leu Met
195 200 205

ATG CTG GTC GCC CCA TCC TAC GGA ATG CGA TGC GTA GGA GTA GGA AAC 672
Met Leu Val Ala Pro Ser Tyr Gly Met Arg Cys Val Gly Val Gly Asn
210 215 220

AGA GAC TTT GTG GAA GGA GTC TCA GGT GGA GCA TGG GTC GAT CTG GTG 720
Arg Asp Phe Val Glu Gly Val Ser Gly Gly Ala Trp Val Asp Leu Val
225 230 235 240

CTA GAA CAT GGA GGA TGC GTC ACA ACC ATG GCC CAG GGA AAA CCA ACC 768
Leu Glu His Gly Gly Cys Val Thr Thr Met Ala Gln Gly Lys Pro Thr
245 250 255

TTG GAT TTT GAA CTG ACT AAG ACA ACA GCC AAG GAA GTG GCT CTG TTA 816
Leu Asp Phe Glu Leu Thr Lys Thr Thr Ala Lys Glu Val Ala Leu Leu
260 265 270

AGA ACC TAT TGC ATT GAA GCC TCA ATA TCA AAC ATA ACC ACG GCA ACA 864
Arg Thr Tyr Cys Ile Glu Ala Ser Ile Ser Asn Ile Thr Thr Ala Thr
275 280 285

AGA TGT CCA ACG CAA GGA GAG CCT TAT CTA AAA GAG GAA CAA GAC CAA 912
Arg Cys Pro Thr Gln Gly Glu Pro Tyr Leu Lys Glu Glu Gln Asp Gln
290 295 300

CAG TAC ATT TGC CGG AGA GAT GTG GTA GAC AGA GGG TGG GGC AAT GGC 960
Gln Tyr Ile Cys Arg Arg Asp Val Val Asp Arg Gly Trp Gly Asn Gly
305 310 315 320

TGT GGC TTG TTT GGA AAA GGA GGA GTT GTG ACA TGT GCG AAG TTT TCA 1008
Cys Gly Leu Phe Gly Lys Gly Gly Val Val Thr Cys Ala Lys Phe Ser
325 330 335

TGT TCG GGG AAG ATA ACA GGC AAT TTG GTC CAA ATT GAG AAC CTT GAA 1056
Cys Ser Gly Lys Ile Thr Gly Asn Leu Val Gln Ile Glu Asn Leu Glu
340 345 350

TAC ACA GTG GTT GTA ACA GTC CAC AAT GGA GAC ACC CAT GCA GTA GGA 1104
Tyr Thr Val Val Val Thr Val His Asn Gly Asp Thr His Ala Val Gly
355 360 365

AAT GAC ACA TCC AAT CAT GGA GTT ACA GCC ACG ATA ACT CCC AGG TCA 1152
Asn Asp Thr Ser Asn His Gly Val Thr Ala Thr Ile Thr Pro Arg Ser
370 375 380

CCA TCG GTG GAA GTC AAA TTG CCG GAC TAT GGA GAA CTA ACA CTC GAT 1200
Pro Ser Val Glu Val Lys Leu Pro Asp Tyr Gly Glu Leu Thr Leu Asp
385 390 395 400

TGT GAA CCC AGG TCT GGA ATT GAC TTT AAT GAG ATG ATT CTG ATG AAA 1248
Cys Glu Pro Arg Ser Gly Ile Asp Phe Asn Glu Met Ile Leu Met Lys
405 410 415

ATG AAA AAG AAA ACA TGG CTT GTG CAT AAG CAA TGG TTT TTG GAT CTA 1296
Met Lys Lys Lys Thr Trp Leu Val His Lys Gln Trp Phe Leu Asp Leu
420 425 430

CCT CTA CCA TGG ACA GCA GGA GCA GAC ACA TCA GAG GTT CAC TGG AAT 1344
Pro Leu Pro Trp Thr Ala Gly Ala Asp Thr Ser Glu Val His Trp Asn
435 440 445

TAC AAA GAG AGA ATG GTG ACA TTT AAG GTT CCT CAT GCC AAG AGA CAG 1392
Tyr Lys Glu Arg Met Val Thr Phe Lys Val Pro His Ala Lys Arg Gln
450 455 460

GAT GTG ACA GTG CTG GGA TCT CAG GAA GGA GCC ATG CAT TCT GCC CTC 1440
Asp Val Thr Val Leu Gly Ser Gln Glu Gly Ala Met His Ser Ala Leu
465 470 475 480

GCT GGA GCC ACA GAA GTG GAC TCC GGT GAT GGA AAT CAC ATG TTT GCA 1488
Ala Gly Ala Thr Glu Val Asp Ser Gly Asp Gly Asn His Met Phe Ala
485 490 495

GGA CAT CTC AAG TGC AAA GTC CGT ATG GAG AAA TTG AGA ATC AAG GGA 1536
Gly His Leu Lys Cys Lys Val Arg Met Glu Lys Leu Arg Ile Lys Gly
500 505 510

ATG TCA TAC ACG ATG TGT TCA GGA AAG TTC TCA ATT GAC AAA GAG ATG 1584
Met Ser Tyr Thr Met Cys Ser Gly Lys Phe Ser Ile Asp Lys Glu Met
515 520 525

GCA GAA ACA CAG CAT GGG ACA ACA GTG GTG AAA GTC AAG TAT GAA GGT 1632
Ala Glu Thr Gln His Gly Thr Thr Val Val Lys Val Lys Tyr Glu Gly
530 535 540

GCT GGA GCT CCG TGT AAA GTC CCC ATA GAG ATA AGA GAT GTG AAC AAG 1680
Ala Gly Ala Pro Cys Lys Val Pro Ile Glu Ile Arg Asp Val Asn Lys
545 550 555 560

AAA AAA GTG GTT GGG CGT ATC ATC TCA TCC ACC CCT TTG GCT GAG AAT 1728
Lys Lys Val Val Gly Arg Ile Ile Ser Ser Thr Pro Leu Ala Glu Asn
565 570 575

ACC AAC AGT GCA ACC AAC ATA GAG TTA GAA CCC CCC TTT GGG GAC AGC 1776
Thr Asn Ser Ala Thr Asn Ile Glu Leu Glu Pro Pro Phe Gly Asp Ser
580 585 590

TAC ATA GTG ATA GGT GTT GGA AAC AGT GCA TTA ACA CTC CAT TGG TTC 1824
Tyr Ile Val Ile Gly Val Gly Asn Ser Ala Leu Thr Leu His Trp Phe
595 600 605

AGG AAA GGG AGT TCC ATT GGC AAG ATG TTT GAG TCC ACA TAC AGA GGT 1872
Arg Lys Gly Ser Ser Ile Gly Lys Met Phe Glu Ser Thr Tyr Arg Gly
610 615 620

GCA AAA CGA ATG GCC ATT CTA GGT GAA ACA GCT TGG GAT TTT GGT TCC 1920
Ala Lys Arg Met Ala Ile Leu Gly Glu Thr Ala Trp Asp Phe Gly Ser
625 630 635 640

GTT GGT GGA CTG TTC ACA TCA TTG GGA AAG GCT GTG CAC CAG GTT TTT 1968
Val Gly Gly Leu Phe Thr Ser Leu Gly Lys Ala Val His Gln Val Phe
645 650 655

GGA AGT GTG TAT ACA ACC ATG TTT GGA GGA GTC TCA TGG ATG ATT AGA 2016
Gly Ser Val Tyr Thr Thr Met Phe Gly Gly Val Ser Trp Met Ile Arg
660 665 670

ATC CTA ATT GGG TTC CTA GTG TTG TGG ATT GGC ACG AAC TCA AGG AAC 2064
Ile Leu Ile Gly Phe Leu Val Leu Trp Ile Gly Thr Asn Ser Arg Asn
675 680 685

ACT TCA ATG GCT ATG ACG TGC ATA GCT GTT GGA GGA ATC ACT CTG TTT 2112
Thr Ser Met Ala Met Thr Cys Ile Ala Val Gly Gly Ile Thr Leu Phe
690 695 700

CTG GGC TTC ACA GTT CAA GCA GAG ATG GGT TGT GTG GTG TCA TGG AGT 2160
Leu Gly Phe Thr Val Gln Ala Glu Met Gly Cys Val Val Ser Trp Ser
705 710 715 720

GGG AAA GAA TTG AGG 2175
Gly Lys Glu Leu Arg
725



(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 
CACTACCGCA AGGTAGAGAG CTCGGCATTG CCTCTTGGTG 40 
(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 
GTGATGGCGT TCCATCTCTC GAGCCGTAAC GGAGAACCAC 40 
(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129 
GCCCTGGCGT TCCATCTCTC GAGCCGAGGG GGAGAGCCGC 
(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130 
ACACTTGCTT TCCACCTCTC GAGCCGAGAT GGAGAGCCGC 
(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131 
GTAATGGCGT TTCACCTCTC GAGCAGAGAT GGCGAACCCC 
(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132 
CCTATCCTTA CTTAAGATCT TCGTGGAGTG ACAGAC 

(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133 
GGATAGGAAT GAATTCTAGA AGCACCTCAC TGTCTG 

(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134 
CCGCAGAGAT CGTTTTCCTG CCTGCATGAT TCC 

(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135 
CCGATCCTAA TTTAAGATCT TTGTGCAGGG AAAGCC 

(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136 
CCTATCCCAA CTTGAGATCT TTATGAAGAT ACAGTA 

(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137 
CCTAACCGTG CTTGAGATCT TTGTGAAGTT ACCGAC 

WHAT IS CLAIMED IS: 

1. A quadravalent vaccine providing immunity 
against all four serotypes of dengue virus comprising a 
DEN-2 PDK-53 infectious clone-derived virus. 

2. A quadravalent vaccine providing immunity 
against all four serotypes of dengue virus comprising a 
chimeric DEN-2/1 virus. 

3. A quadravalent vaccine providing immunity 
against all four serotypes of dengue virus comprising a 
chimeric DEN-2/3 virus. 

4. A quadravalent vaccine providing immunity 
against all four serotypes of dengue virus comprising a 
chimeric DEN-2/4 virus. 

5. A quadravalent vaccine providing immunity 
against all four serotypes of dengue virus comprising 
DEN-2 PDK-53 infectious clone-derived and chimeric DEN-2/1, 
DEN-2/3, and DEN-2/4 viruses. 

6. A method of immunization in which a desired 
immune response is produced against all four serotypes of 
dengue virus comprising the step of administering to a 
subject a quadravalent vaccine comprising DEN-2 PDK-53 
infectious clone-derived and chimeric DEN-2/1, DEN-2/3, 
and DEN-2/4 viruses. 

7. A composition of matter comprising a full 
genome-length infectious cDNA clone for a DEN-2 virus, 
strain 16681. 

8. A composition of matter comprising a full 
genome-length infectious cDNA clone for a DEN-2 virus of 
strain characterized as replicating to high titer in cell 
culture. 

9. A composition of matter comprising a full 
genome-length infectious cDNA clone for a DEN-2 virus, 
strain 16681, having the identifying characteristics of 
ATCC 69826. 

10. A composition of matter comprising a full 
genome-length infectious cDNA clone for a DEN-2 virus, 
strain 16681, attenuated derivative, PDK-53. 

11. A composition of matter comprising a full 
genome-length infectious cDNA clone for a DEN-2 virus 
attenuated derivative, characterized as replicating to 
high titer in cell culture. 

12. A composition of matter comprising a full 
genome-length infectious cDNA clone for a DEN-2 virus, 
strain 16681, attenuated derivative, PDK-53, having the 
identifying characteristics of ATCC 69825. 

13. A composition of matter comprising a full 
genome-length infectious cDNA clone of a chimeric DEN-2/1 
virus, wherein said virus is characterized as expressing 
the prM and E genes of a DEN-1 attenuated virus in 
the context of the nonstructural genes of the DEN-2 PDK-53 
virus. 

14. The composition of matter of Claim 13, wherein 
said DEN-1 attenuated virus is DEN-1 PDK-13. 

15. A composition of matter comprising a full 
genome-length infectious cDNA clone of a chimeric DEN-2 
virus, wherein said virus is characterized as expressing 
the antigenicity of a DEN-1 attenuated virus. 

16. A composition of matter comprising a full 
genome-length infectious cDNA clone of a chimeric DEN-2/3 
virus, wherein said virus is characterized as expressing 
the prM and E genes of a DEN-3 attenuated virus in the 
context of the nonstructural genes of the DEN-2 PDK-53 
virus. 

17. The composition of matter of Claim 16, wherein 
said DEN-3 attenuated virus is DEN-3 PGMK30/FRhL-3. 

18. A composition of matter comprising a full 
genome-length infectious cDNA clone of a chimeric DEN-2 
virus, wherein said virus is characterized as expressing 
the antigenicity of a DEN-3 attenuated virus. 

19. A composition of matter comprising a full 
genome-length infectious cDNA clone of a chimeric DEN-2/4 
virus, wherein said virus is characterized as expressing 
the prM and E genes of a DEN-4 attenuated virus in the 
context of the nonstructural genes of the DEN-2 PDK-53 
virus. 

20. The composition of matter of Claim 19, wherein 
said DEN-4 attenuated virus is DEN-4 PDK-48. 

21. A composition of matter comprising a full 
genome-length infectious cDNA clone of a chimeric DEN-2 
virus, wherein said virus is characterized as expressing 
the antigenicity of a DEN-4 attenuated virus. 

22. A genetic construct comprising a DNA sequence 
operably encoding the polyprotein of DEN-2 virus, strain 
16681. 

23. The genetic construct of Claim 22, wherein said 
polyprotein is the polyprotein encoded by the nucleotide 
sequence of SEQ ID NO: 1. 

24. A genetic construct comprising a DNA sequence 
operably encoding at least one protein of DEN-2 virus, 
strain 16681. 

25. The genetic construct of Claim 24, wherein said 
protein is a protein encoded by the nucleotide sequence of 
SEQ ID NO: 1. 

26. A genetic construct comprising a DNA sequence 
operably encoding the polyprotein of DEN-2 virus, strain 
16681, attenuated derivative, PDK-53. 

27. The genetic construct of Claim 26, wherein said 
polyprotein is the polyprotein encoded by the nucleotide 
sequence of SEQ ID NO: 2. 

28. A genetic construct comprising a DNA sequence 
operably encoding at least one protein of DEN-2 virus, 
strain 16681, attenuated derivative, PDK-53. 

29. The genetic construct of Claim 28, wherein said 
protein is a protein encoded by the nucleotide sequence of 
SEQ ID NO: 2. 

30. A genetic construct comprising a DNA sequence 
operably encoding at least one structural protein of DEN-1 
PDK-13. 

31. The genetic construct of Claim 30, wherein said 
structural protein is a structural protein encoded by the 
nucleotide sequence of SEQ ID NO: 124. 

32. A genetic construct comprising a DNA sequence 
operably encoding at least one structural protein of DEN-3 
PGMK30/FRhL-3. 

33. The genetic construct of Claim 32, wherein said 
structural protein is a structural protein encoded by the 
nucleotide sequence of SEQ ID NO: 125. 

34. A genetic construct comprising a DNA sequence 
operably encoding at least one structural protein of DEN-4 
PDK-48. 

35. The genetic construct of Claim 34, wherein said 
structural protein is a structural protein encoded by the 
nucleotide sequence of SEQ ID NO: 126. 

36. A host cell comprising the genetic construct of 
any of Claims 22-35. 